1. No element in the structure of our national education
occupies at the present moment more public attention than our
system of examinations. It guards the gates that lead from
elementary education to intermediate and secondary education,
from secondary education to the Universities, the professions,
and many business careers, from the elementary and middle
stages of professional education to professional life.

2. Quite apart from the safeguards imposed by Acts of Parliament
and Government authorities, a whole congeries of
examinations has sprung up in the last century, created by
private and public bodies*. Examinations have become a
familiar topic in our newspapers and in our homes. The
examination system has grown to be an important element, not
only in our education, but in the whole social system of our
country; and the interest of many other countries in this matter
is not less than our own.

3. The investigations on examinations of which this pamphlet
is a summary are the outcome of an International Conference
on Examinations held in May, 1931, at Eastbourne, under the
auspices of the Carnegie Corporation, the Carnegie Foundation,
and the International Institute of Teachers College, Columbia
University. The countries represented at the Conference were
(in alphabetical order) England, France, Germany, Scotland,
Switzerland, and the United States.* As a result of that

* In a Conspectus in preparation for the Committee there will appear
between 150 and 200 names of such bodies, exclusive of Universities and
Local Education Authorities.

* The Report of the Eastbourne Conference on Examinations, edited by
Professor Paul Monroe, Director of the International Institute, was
published by the Bureau of Publications, Teachers College, Columbia
University, New York City, in 1931.

The representatives from the United States at the Conference were as
follows:—

Dr. C. H. Judd, Dean of the School of Education, University of
Chicago.

Dr. Frederick P. Keppel, President of the Carnegie Corporation,
New York City.

Dr. Paul Monroe, Director of the International Institute, Teachers
College, Columbia University.
Conference committees were set up in all the European countries
above-named. Each of these committees received a grant for
three years from the Carnegie Corporation through the International
Institute, and each of them reported independently to a
second International Conference held in June, 1935, at Folkestone,
under the same auspices as the Conference held at Eastbourne.
The Committees have done their work on independent lines and
have reported separately. This pamphlet is substantially
identical with the report presented by the English Committee
to the Folkestone Conference, and it is published in its present
form in accordance with a wish expressed at that Conference.

4. The English Committee consisted of the following: Sir
Michael Sadler, K.C.S.I. (Chairman), Dr. P. B. Ballard, Dr. C.
Delisle Burns, Professor Cyril Burt, Sir Philip Hartog, K.B.E.
(Director), Professor Sir Percy Nunn, Professor C. Spearman,
F.R.S., and Professor Graham Wallas. The Committee suffered
a great loss in 1932 by the death of Professor Graham Wallas,
who was replaced by Professor Godfrey Thomson, a member of
the Scottish Committee. Professor H. R. Hamley and Professor
C. W. Valentine joined the English Committee in the present
year*. The address of the English Committee is 1, Plowden
Buildings, Temple, London, E.C.4.

Dr. Henry Suzzallo, President of the Carnegie Foundation, New
York City.

Dr. Edward L. Thorndike, Professor of Education, Teachers College,
Columbia University.

* The membership of the other Committees is shown below:
France—

M. A. Desclos, Directeur-adjoint de l'Office National des Universités
et Écoles Françaises (Président).

M. Barrier, Adjoint au Directeur de l'Enseignement Primaire.

M. Bouglé, Directeur-adjoint de l'École Normale Supérieure.

M. Gastinel, Inspecteur Général de l'Instruction Publique.

M. Laugier, Maître de Conférences à la Faculté des Sciences de Paris.

M. Luc, Directeur-adjoint de l'Enseignement Technique.

The original Committee included:

M. Charles Maurain, Doyen de la Faculté des Sciences de l'Université
de Paris (who resigned on account of the pressure of other duties).

M. Cope, Président du Syndicat National des Professeurs des Lycées
de Garçons et de l'Enseignement Secondaire Féminin (since
deceased).

Germany—

Professor Erich Hylla, Ministerialrat im Ministerium für Kunst,
5. The Committee engaged Dr. E. C. Rhodes, Reader in
Statistics in the University of London, to act as their statistician.

6. Touching education and social life as they do on so many
points, the problems of examinations are many and varied. The
Committee have published an English Bibliography of Examina-

Wissenschaft, und Volksbildung in Preussen; Professor an der
Pädagogischen Akademie, Halle.

Dr. Robert Ulich, Ministerialrat im Ministerium für Volksbildung
in Sachsen.

The original Committee included also : 

Professor Dr. Carl Becker, Minister a.D. für Kunst, Wissenschaft,
und Volksbildung in Preussen; Professor an der Universität,
Berlin (since deceased).

Dr. Otto Bobertag, University of Berlin (since deceased).

Scotland—

William Boyd, M.A., B.Sc., D.Phil., Lecturer in Education, Glasgow
University.

Shepherd Dawson, M.A., D.Sc., Lecturer in Psychology, Jordanhill
Training College, Glasgow (since deceased).

Professor James Drever, M.A., D.Phil., Professor of Psychology,
Edinburgh University.

Thomas Henderson, B.Sc., F.E.I.S., Hon. Secretary of the Scottish
Council for Research in Education.

W. A. F. Hepburn, M.C., M.A., B.Ed., Director of Education to the
Ayrshire Education Committee.

Professor W. W. McClelland, M.A., B.Sc., B.Ed., Professor of
Education, St. Andrews University.

J. Mackie, M.A., D.Sc., F.R.S.E., Head Master, Leith Academy.

Robert R. Rusk, M.A., B.A., Ph.D., Lecturer in Education, Jordanhill
Training College, Glasgow: Director to the Scottish Council
for Research in Education.

J. C. Smith, C.B.E., M.A., D.Litt., formerly Senior Chief Inspector
of Schools, Scottish Education Department.

Professor Godfrey H. Thomson, Ph.D., D.Sc., Professor of Education,
Edinburgh University.

Switzerland—

M. Pierre Bovet, Professeur à l'Université de Genève; Directeur
de l'Institut Universitaire des Sciences de l'Éducation, Genève.

Dr. Brenner, Directeur du Lehrerseminar, Bâle.

M. Édouard Claparède, Professeur de Psychologie à l'Université de
Genève; Directeur de l'Institut Jean-Jacques Rousseau.

M. Robert Dottrens, Directeur d'Écoles, Troinex, Genève (Dr. Soc.).

Dr. Charles Junod.

M. Albert Malche, Conseiller aux États; Professeur à l'Université
de Genève.

M. Jean Piaget, Directeur du Bureau International d'Éducation,
tions (1900-32)*, which shows how much has been written on the
subject in this country during the first third of the century.
The Committee are also publishing a volume of Essays on
Examinations, dealing with a number of aspects of the subject,
which will appear soon after this pamphlet, and a Conspectus
of Examinations in Great Britain and Northern Ireland, which
will appear later. But the main work carried out for the
Committee will be recorded in a volume entitled The Marks of
Examiners, now in course of printing, of which the present
pamphlet is a summary.

7. The object of the investigations to be described may be
explained very simply. Professor F. Y. Edgeworth, many years
ago, found that the marks allotted independently by twenty-eight
different examiners to a piece of Latin prose varied from 45 to
100 per cent. In the United States, Messrs. Starch and Elliott,
and, in France, M. Laugier and Mlle. Weinberg have found
similar results, but no systematic comparison has hitherto been
published of the marks allotted by a number of different
examiners, all experienced and qualified for their task, to sets of
scripts (answer-books) actually written at public examinations.
Both the English and the French Committees have attacked
this subject, and the present pamphlet gives a fairly extended
summary of the English results and a brief one of the French.
These results are similar in the two countries, and equally
disquieting. It is clear that the part played by chance in the
verdicts given at different examinations on which careers depend
must often at the present moment be a great one. The Committee
are well aware that the consideration of borderline cases by
examination authorities does materially diminish the chances
of a candidate being wrongly rejected; but it must be pointed
out that candidates may be placed in error below the

Genève; Professeur extraordinaire à l'Université de Genève;
Co-directeur de l'Institut Jean-Jacques Rousseau.

Dr. W. Schohaus, Schweizerische Erziehungs-Rundschau, Kreuzlingen,
Thurgovie.

Dr. Ida Somazzi, Seminar, Berne.

Dr. Hans Stettbacher, Lehramtskurse, Universität, Zürich.

M. Teodoro Valentini, Professeur, Scuola Normale, Locarno, Tessin.

* An English Bibliography of Examinations (1900-1932), by Mary C.
Champneys, with a Foreword by Sir Michael Sadler and Sir Philip Hartog
(Macmillan & Co., Ltd.), 1934.
“borderline.” Again, it must be remembered in the interest
of the public, to whom an examination certificate means a
certificate of efficiency, that candidates may now by chance
obtain such certificates when they should by rights be rejected.

8. Of all the results recorded by the English Committee perhaps
the most disturbing are those recorded in the investigation on the
marking of School Certificate History scripts. It was found that
when fourteen experienced examiners re-marked independently
fifteen scripts which had all received the same moderate mark
from the examining authority by which they were furnished,
these examiners, between them, allotted over forty different marks
to the several scripts. It was found, further, that when these
examiners re-marked once more the same scripts after intervals
of from twelve to nineteen months, they changed their minds
as to the verdict of Pass, Fail, and Credit in 92 cases out of the
total of 210. Clearly a test of this kind cannot inspire confidence.

9. Our investigations show that the employment of boards of
examiners instead of individual examiners, though it diminishes,
does not remove the element of chance in examinations, and that
boards, as well as individuals, may disagree in their verdicts.
The element of chance in examinations still subsists to a dangerous
degree in the subjects which have been investigated by the
Committee.

10. The question may at once be asked: Should examinations
be abolished? If not, what remedies can be suggested?

The Committee are clearly opposed to the root and branch
policy. They are of opinion that examinations as a test of
efficiency are necessary. They are further of opinion that, in
addition to those examinations which yield identical results
when applied by different examiners (i.e. “New Type” or
“Objective” examinations), the traditional “essay” examination
should be preserved. But they hold that it is as impracticable
to recommend an a priori cure for the defects of the
present examination system as it would be to recommend an a
priori cure for a disease. It is only by careful and systematic
experiment that methods of examination can be devised not
liable to the distressing uncertainties of the present system.
No doubt investigations like those recorded by our Committee,
and administrative experiments in allowing teachers, in
conjunction with Government or University inspectors, to “brand
their own herrings,” would involve expenditure, but such
expenditure and experiments would be justified in the public
interest.


The Committee desire to acknowledge their deep obligation to
the various examination authorities by whom they have been
furnished with the scripts which formed the material for their
investigations, or by whom they have been assisted in other
ways, and to the examiners who marked the scripts or took part
in the viva voce examination. Without the cordial assistance
both of examination authorities and of examiners, it would have
been impossible for the Committee to carry out their investigations
on the lines which they had planned.

In conclusion, the Committee wish to express their warm
appreciation of the generosity and initiative of the Carnegie
Corporation, the Carnegie Foundation, and the International
Institute of Teachers College, Columbia University, to which
this Committee and the parallel Committees in other countries
owe their existence.



Part I — General
Introduction

1. The main object of the investigations was to test the concurrence
of the marking of a number of examination scripts by
a number of independent examiners, or, in certain cases, by two
independent boards of examiners.

2. In carrying out the investigations, the following general 
principles were observed : — 

(i) The scripts investigated were all actual scripts which had
been written by candidates in the course of an ordinary examination.
It was only after long and delicate negotiations with the
various bodies that the actual scripts could be secured.

(ii) The following examinations were selected by the Committee
for the purpose of the investigations, as important and typical:

(a) School Certificate Examinations, for which there are
between 60,000 and 70,000 candidates every year.
These are the School Leaving Examinations taking place
at the age of about 16, the passing of which under certain
conditions qualifies for entrance to a university and to
a number of professions. A School Certificate is also
required as a condition of engagement by many business
men.

(b) Special Place Examinations. These are the examinations
held at the age of between 10 and 12, on the results of
which children in elementary schools gain admittance to
central schools or secondary schools. The number of
entries every year is estimated at from 400,000 to
600,000.

(c) A College Scholarship examination at one of the older
universities in English Essay.

(d) A University Honours examination in Mathematics.

(e) A University Honours examination in History.

(iii) Every mark on the scripts made by the original examiners
was completely removed before they were circulated or
photographed.

(iv) The examiners by whom the papers were marked (men
and women) were in every case examiners with experience of the
kind of examination investigated. In four of the investigations
on School Certificate examinations the examiners in the various
subjects were chosen in each case from the panel of a single
examining body (other than the body which had supplied the
scripts).* The examiners for the College Entrance Scholarship
Essay scripts and for the University Mathematical Honours
scripts were in either case examiners of the university for which
the scripts were written. For the History Honours scripts it was
impossible to secure a sufficient number of examiners from the
same university, and the 17 examiners concerned were chosen from
nine different universities and included nine university
professors.

(v) The time allowed for the correction of the scripts was,
as a rule, the time desired by the examiners concerned. It may
be fairly said that the scripts were corrected under less pressure
in respect of time than ordinarily prevails at an examination,
so that the marks may be regarded as expressing the deliberate
opinion of the examiners concerned.

(vi) Every precaution was taken to ensure that no answer
was overlooked by an examiner, and in any case of doubt the
script was returned to the examiner for reconsideration.

(vii) The examiners were all paid either in accordance with
the usual scale adopted for the marking of scripts of the same
kind, or, in certain cases, on a scale slightly higher. The
Committee regard the payment of the examiners as an essential
feature of the investigation. It might have been possible to
secure the voluntary help of competent examiners, but marking
carried out by voluntary helpers would have been carried out
under conditions different from those of a real examination. In
an investigation of this kind it is to be remembered that the actual
task of marking examination scripts is for most examiners
wearisome, and the psychological condition of a person who is
unpaid for performing such work is likely to be different from the
condition of a person who is adequately paid.

(viii) The marks were all analysed by Dr. E. C. Rhodes,
Reader in Statistics in the University of London, and the results
have been prepared for publication by the Director and Dr. Rhodes

* In the investigation on School Certificate English conducted under the
auspices of the Durham University School Examinations Board, of which
we are printing and extending the results, the examiners were not all
chosen from the panels of the same examining body.
and submitted to the Committee. The volume containing
the details of the investigations will extend to about 260
pages, and will comprise two sections: Section I, containing
the important details and figures for each investigation, and
Section II, containing a more elaborate statistical analysis
by Dr. Rhodes, in which it is attempted to separate the
differences of marking due to difference of the standards adopted
by the individual examiners from the random deviations
of each examiner from his own standard. It will include
additional memoranda by Professor Cyril Burt and Dr. Rhodes
on the most suitable methods of analysis for data of this kind.

(ix) The Committee are anxious that their investigations
should not be interpreted as a criticism of any particular body.
No mention has been made in these investigations of the marks
allotted to the scripts by the original examining bodies.
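
The separation mentioned in (viii), between differences of standard among examiners and the random deviations of each examiner from his own standard, may be pictured by a small sketch. It assumes a simple additive model (mark = general average + candidate's merit + examiner's standard + random deviation); this is an illustrative assumption only and is not put forward as the method used by Dr. Rhodes. The marks and the examiner labels X, Y, Z are invented.

```python
from statistics import mean

# Invented marks: each row is one examiner, each column one candidate.
marks = {
    "X": [50, 62, 41, 55],
    "Y": [44, 58, 39, 47],
    "Z": [56, 63, 49, 60],
}

def decompose(marks):
    """Split every mark into grand mean + candidate effect + examiner standard + residual."""
    examiners = list(marks)
    n_cand = len(next(iter(marks.values())))
    grand = mean(m for row in marks.values() for m in row)
    # Examiner standard: how far an examiner's average lies from the grand mean.
    standard = {e: mean(marks[e]) - grand for e in examiners}
    # Candidate effect: how far a candidate's average lies from the grand mean.
    candidate = [mean(marks[e][j] for e in examiners) - grand for j in range(n_cand)]
    # Residual: what is left once both are removed (the "random deviation").
    residual = {
        e: [marks[e][j] - grand - standard[e] - candidate[j] for j in range(n_cand)]
        for e in examiners
    }
    return grand, standard, candidate, residual

grand, standard, candidate, residual = decompose(marks)
for e in marks:
    print(e, "standard:", round(standard[e], 2),
          "residuals:", [round(r, 2) for r in residual[e]])
```

In this invented case examiner Y marks on the average five below the general level and examiner Z five above; whatever remains in the residuals cannot be removed by any adjustment of averages.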

3. The Committee believe that, in view of the precautions taken,
the discrepancies between the marks of the different examiners
afford an indication of the element of chance in examinations
as they are at present conducted. The investigations show
how a change in the selection of particular examiners, from a panel
of persons who are all experienced and regarded as all well
qualified, would tend to affect the fate of individual candidates.

4. Besides the investigations into written examinations, the
Committee carried out one investigation of a particularly
interesting kind into the concurrence of the marking of two
boards of examiners at an interview of the same kind as that
held at Civil Service examinations, with the object defined in
para. &l(d) below.

The results of the different investigations are briefly summarised
in the following sections.

School Certificate History

5. Fifteen scripts were selected which had been awarded
exactly the same “middling” mark by the School Certificate
authority concerned, and these scripts were marked in turn and
independently by 15 examiners, who were asked to assign to them
both marks and awards of Failure, Pass and Credit. After an
interval which varied with the different examiners, but was not
less than 12 nor more than 19 months in any instance, the same
scripts, being renumbered, were marked again by 14 out of the
15 examiners (one examiner being unable to serve again).
The 14 examiners assured us that they had kept no record
of their previous work and this was indeed obvious from the results.

6. Whereas the scripts had been all allotted the same moderate
mark by the original examining body, they were allotted by the
15 examiners on the first occasion 43 different marks out of a
maximum of 96, varying from 21 to 70. On the second occasion
the total number of the different marks was 44, and the marks
varied from 10 to 71. There is no space here to analyse the
differences of the marks allotted by the various examiners to
the same candidates. In one case the difference was 30 marks
out of the maximum of 96.

7. Perhaps the most striking feature in the investigation is
this: On each occasion the examiners awarded not only numerical
marks, but the verdict of Failure, Pass or Credit. In comparing
the two sets of awards we can only take into account the
14 examiners who acted on both occasions. On each occasion
the 14 examiners awarded a total of 210 verdicts to the
15 candidates. It was found that in 92 cases out of the 210 the
individual examiners gave a different verdict on the second
occasion from the verdict awarded on the first.
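
The arithmetic of this paragraph may be checked in a few lines. The figures are those given in the text; only the variable names are our own.

```python
examiners = 14      # examiners who marked on both occasions
candidates = 15     # scripts marked by each examiner
changed = 92        # verdicts that differed between the two occasions

total_verdicts = examiners * candidates      # verdicts awarded on each occasion
share_changed = changed / total_verdicts     # fraction of verdicts that changed

print(total_verdicts)             # 210
print(f"{share_changed:.1%}")     # 43.8%
```

That is to say, rather more than two verdicts in every five were altered when the same examiner saw the same script again.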

8. In nine cases candidates were moved two classes up or down.
One examiner changed his verdict in regard to eight candidates
out of the fifteen. Yet he only varied his average by a unit,
and he awarded the same number of Failure marks, one less Pass,
and one more Credit. Such irregularity of judgment is not only
formidable, but it is one which would not be detected by any
ordinary analysis. Statistically his results on the two occasions
were almost the same, but the fate he allotted to half the candidates
was different.
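
How an examiner's totals can remain almost unchanged while the fate of individual candidates is altered may be shown by an invented case. The two lists of verdicts below are hypothetical (they are not the examiner's actual awards) and are constructed only to reproduce the pattern described: the same number of Failures, one Pass fewer, one Credit more, and eight candidates out of fifteen given a different verdict.

```python
from collections import Counter

# Invented verdicts for the same 15 candidates on two occasions.
# F = Failure, P = Pass, C = Credit.
first = list("FFFPPPPPPCCCCCC")
second = list("CPFFFCCPPPPCCCC")

# Totals in each class: this is all that an ordinary summary would show.
print(Counter(first))
print(Counter(second))

# Number of candidates whose verdict differs between the two occasions.
changed = sum(a != b for a, b in zip(first, second))
print(changed)
```

A comparison of the class totals alone shows a shift of a single candidate, whereas a comparison candidate by candidate shows eight.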

In some cases the examiners altered their general standard
on the second occasion. One examiner moved 5 candidates
down a class, and one down two classes. Another examiner
moved 7 candidates down a class. Of the 14 examiners there is
only one who was exceptionally steady and whose numerical
mark never varied by more than 7 out of 100.

9. It may well be asked, in view of the extreme differences
of these results, what validity can be attached to the marking
of School Certificate History papers. It is perfectly true that,
as Professor Spearman has pointed out, validity and “reliability”
or concurrence of marking are by no means equivalent terms,
but no process of measurement can be valid when it yields such
discrepant results in the hands of the same examiners on two
different occasions.

School Certificate Latin

10. This investigation dealt with two 2-hour papers, of which
the marks were added together. The scripts of 15 candidates
were so selected that the candidates had obtained at the original
examination exactly the same moderate mark for the two papers
combined. 15 examiners were appointed, of whom two were
treated as Chief Examiners for the drafting of a marking-scheme.
The examiners were furnished with examination papers (though
not with “trial-scripts” as in later experiments). The marking
scheme was finally settled after correspondence with all the
examiners concerned on all points regarded as contentious. The
correspondence showed that six of the examiners preferred more
detailed instructions in respect of unprepared passages than the
other seven, and it was decided to adopt two marking-schemes
to meet the wishes of the different examiners concerned. The
examiners were therefore divided into two Groups: Group I,
consisting of six examiners who used Scheme I, and Group II,
consisting of seven examiners who used Scheme II. The two schemes
differed only by the addition of 19 more detailed instructions in
respect of unprepared passages from and into Latin in one paper
than the other. Of these, 10 were allotted to a question which
was only selected by a single candidate. The maximum for
each question and the total maximum were the same in the two
Schemes. It is obvious that the two Groups cannot strictly be
regarded as analogous to two independent Boards, who would
no doubt have adopted marking-schemes differing far more widely.
11. Whereas the fifteen couples of scripts had originally been
assigned the same moderate mark, under Scheme I they received
from the 6 examiners concerned 24 different marks ranging from
28 to 56; and under Scheme II they received from the seven
examiners concerned 28 different marks ranging from 33 to 61.
The total number of different marks allotted under the two
schemes was 31 and the total range from 28 to 61. It is quite
obvious that in spite of the detailed marking schemes the individual
examiners adopted very different standards.

12. A detailed analysis has been made of the marks for the
different questions. These questions were originally marked
on a higher scale, which was reduced so as to yield a maximum
for the two papers of 100. It is remarkable that the difference
between examiners varies very much with the candidate. Thus
for one candidate the marks for a question of which the original
maximum was 60 (translation from Caesar), the extreme range of
the marks allotted by the 13 examiners is only 0 marks, whereas
for another candidate the extreme difference was 28 marks or
47 per cent. of the maximum. In the case of some questions on
accidence the difference between the marks is very small.

School Certificate French

13. The scripts investigated were written as answers to two
2-hour papers. Two independent Boards were set up, each consisting
of a Chief Examiner and six other examiners. The
examining body supplied at our request 160 scripts altogether,
chosen so that the marks allotted by the original examiners
corresponded to a normal frequency distribution and ranged from
the worst to the best. Of these 60 were selected, corresponding
to the same normal distribution, for final marking, and were
reproduced photographically. The others served as
“trial-scripts.”

14. Each Chief Examiner drew up his own marking-scheme,
discussed it with his Board in the ordinary way and, after settling
his scheme, gave each of his Board a number of trial-scripts to mark
so as to control the methods of marking of each examiner. As a
result of this process the two Boards quite independently adopted
complex schemes, which were, however, obviously the result of
a common tradition. Board I gave 6 general directions, and
640 detailed directions for Paper I and 200 detailed directions for
Paper II, mainly concerning points of English and French in
translation. The scheme of Board II included 700 detailed
items for Paper I and 300 for Paper II. These detailed directions
did not require any appreciable effort of memory on the part of
the examiners. Although the general methods used by the two
Boards were obviously the same, the detailed directions were
in a number of cases different, and in some 60 cases were actually
conflicting. Each Board settled its own standard for Failure,
Pass or Credit. The Chief Examiner, after seeing samples of the
trial markings of each examiner, gave instructions for his marks
to be raised or lowered in some particular way.
15. The returns of the individual examiners showed that the
number of Failures varied from 6 to 16, of Passes from 7 to 16,
of Credits from 21 to 30, and of Distinctions from 1 to 9. Agreement
was reached between the 6 examiners of Board I on the
awards to only 27 candidates out of 60, and agreement was reached
between the examiners of Board II in regard to only 30 out of 60.
The average range (the difference between the highest and lowest
mark allotted by the different examiners to the same script) for
Board I was 10.0 marks and for Board II 7.8, out of one hundred.
The extreme range was 19 for Board I, and 16 for Board II.
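
The two measures used in this paragraph, the average range and the extreme range, follow directly from the definition given in the text. The sketch below applies that definition to invented marks for three scripts; the figures are illustrative only and are not taken from the investigation.

```python
from statistics import mean

# Invented marks out of 100: each row is one script, each column one examiner.
scripts = [
    [52, 47, 55, 50, 49, 58],
    [61, 66, 60, 63, 59, 64],
    [38, 45, 41, 36, 44, 40],
]

def mark_range(row):
    """Difference between the highest and lowest mark given to one script."""
    return max(row) - min(row)

ranges = [mark_range(row) for row in scripts]
average_range = mean(ranges)    # mean of the per-script ranges
extreme_range = max(ranges)     # largest range found for any single script

print(ranges, average_range, extreme_range)
```

The average range describes the ordinary disagreement to be expected on a script, the extreme range the worst disagreement actually met with.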

16. One of the interesting features of the marking of the two
Boards was that the average mark of Board I for a piece of
dictation expressed as a fraction of the maximum was 14 per cent.
higher than the corresponding average mark of Board II, and that
the average mark of Board I for a question involving translation
from English into French, expressed as a fraction of the maximum,
was about 24 per cent. lower than the corresponding average of
Board II. The maxima were approximately the same for the two
Boards.

When we consider the average marks for the two Boards of the
scripts treated as a whole, such differences disappear; but the
fate of individual candidates depends on these differences which
a similarity of general results effectively conceals. A candidate
who did poorly in dictation would be more leniently treated
by the examiners of Board I. A candidate who did poorly in
translation from English into French would be more leniently
treated by the examiners of Board II. Moreover, the fate of a
candidate might depend on the particular member of the Board
to whom his script is assigned for marking.
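
The manner in which equal averages may conceal unequal treatment of individuals can be illustrated by an invented case. The sketch assumes, purely for illustration, that each Board deducts a fixed number of marks for every error and that the Boards differ only in the deduction applied to each question; the deductions, the error counts and the candidates P and Q are hypothetical and form no part of the investigation.

```python
MAX_PER_QUESTION = 20

# Assumed marks deducted per error. Board I is lenient on dictation,
# Board II on translation into French.
penalty = {
    "Board I": {"dictation": 1, "translation": 2},
    "Board II": {"dictation": 2, "translation": 1},
}

# Invented numbers of errors made by two candidates.
errors = {
    "P (weak in dictation)": {"dictation": 8, "translation": 2},
    "Q (weak in translation)": {"dictation": 2, "translation": 8},
}

def total_mark(board, candidate):
    """Total over both questions after the Board's deductions."""
    return sum(
        MAX_PER_QUESTION - penalty[board][q] * errors[candidate][q]
        for q in ("dictation", "translation")
    )

for board in penalty:
    totals = {c: total_mark(board, c) for c in errors}
    average = sum(totals.values()) / len(totals)
    print(board, totals, "average:", average)
```

Both Boards return the same average, yet candidate P receives 28 from Board I and 22 from Board II, and candidate Q the reverse.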

School Certificate Chemistry

17. The procedure in the case of Chemistry was almost identical
with that adopted in the case of French, but the number of
scripts selected for the final marking was only 30 instead of 60,
as the average length of the scripts was considerable.

Board I in its marking-scheme gave about 96 detailed directions
to the examiners, and Board II about 86. For certain details
the two Boards gave the same marks, for others they gave
marks appreciably different. The differences between the Boards
would no doubt have been greater but for the fact that the
candidates were instructed to select any six questions out of
eight, so that it was necessary to allot identical or almost identical
maxima to the different questions.

18. In the returns of the individual examiners of the two
Boards, taken together, the number of awards of Failure varied
from 6 to 10, of Passes from 2 to 11, of Credit from 9 to 18, and of
Distinction from 0 to 8. No mere adjustment of averages would
remove such discrepancies between the distributions of awards
by individual examiners. The differences between the two
Boards in respect of different questions are less than in the case of
French, but for one question, dealing with a simple question of
chemical theory, the average mark for Board I was 3.'1 per cent.
of the maximum, while the corresponding average for Board II
was 46. It is only in regard to this point that we get anything
comparable to the remarkable differences which were found
between the two French Boards (see para. 16 above). Nevertheless
it is true that, as in French, the fate of a candidate depends
very largely on the personnel of the Board, and on the particular
examiner to whom his script is assigned. The average range of
marks was 10 for Board I, and 10.9 for Board II, out of one
hundred. The extreme range was 23 for Board I, and 28 for
Board II.

School Certificate English

19. We include in this Report details of an investigation on
School Certificate English, carried out just before our own work
was begun, on behalf of the Durham University Examinations
Board. It was on lines similar to those which we have adopted,
and yielded similar results. An analysis of the figures by
Mr. C. Roberts and Professor H. V. A. Briscoe was published
by permission of the Durham Board (in The A.M.A. for Dec.
1931, and Feb. 1932). The detailed mark-sheets were later
furnished to us by the Board, and we have made use of these in
both parts of this Report. The whole of the English scripts from
one school, 48 in number, were marked separately by seven
examiners, A, B, C, D, E, F, and G, selected from the panels of
four different School Certificate authorities, who had the reputation
of being specially experienced and trusted examiners. Of these,
C, D, and E were ordinarily engaged by one authority, B and F
by a second, and A and G by the third and fourth respectively.

20. The examiners all accepted the marking-scheme of the
Chief Examiner of the Durham Board.

21. There were two papers: Paper I, a 2 hours’ paper on Essay
and Précis, and Paper II, a 3 hours’ paper, mainly on set books
in prose and verse. The marks for the two papers were added
and then reduced so as to correspond with a maximum of 100.

22. The minimum range, i.e., the extreme difference between
the marks allotted to an individual candidate, was 7, the
maximum 31, and the average 18.5. But the differences between
the examiners were shown most clearly by the differences between
the awards of Failures, Passes, Credits, and Special Credits of the
individual examiners.


The following Table shows the numbers of awards:

Examiner    Fail    Pass    Credit    Special Credit
   A          1      16       27            4
   B          0       2       34           12
   C          7      30       11            0
   D          0       9       36            3
   E          6      16       27            0
   F          2       7       37            2
   G         19      12       17            0


23. An inspection of the figures in greater detail shows that
in the case of only one candidate out of the 48 were all seven
examiners agreed as to the class in which he should be placed;
and there were only eight cases where six of the examiners were
in agreement. Examiner G “ploughed” 19 candidates, while no
other examiner “ploughed” more than seven, and two “ploughed”
none. Examiner B awarded 12 “Special Credits,” while the other
examiners awarded very few or none.

24. The examination is not a competitive examination, and
therefore the order of merit is not of any special importance to
the candidates. But the differences of opinion of the different
examiners in regard to their relative merits are shown by the
following statement:

The difference between the highest and lowest position assigned
to a candidate is:

30 or more in 6 cases
20-29 in 19 cases
10-19 in 18 cases
Under 10 in 6 cases

25. The divergencies of the marks allotted to the two Papers
considered separately were greater than those shown when the
marks were added together.

26. Mr. Roberts and Professor Briscoe draw attention to certain
extreme divergencies. On Paper I (Essay and Précis):

                                                     Range of Marks

Candidate X was awarded 28, 32, 40, 66, 66, 68, 80 out of 100 by
the seven examiners                                        52

Candidate Y was awarded 24, 42, 48, 60, 60, 64, 70 out of 100 by
the seven examiners                                        45

Candidate Z was awarded 16, 36, 38, 44, 44, 46, 60 out of 100 by
the seven examiners                                        44
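
The "range" quoted for each candidate is simply the highest mark minus the lowest among the examiners. As a minimal sketch (the function name `mark_range` is our own label, not the report's), the figure for Candidate Z can be checked like this:

```python
def mark_range(marks):
    """Return the difference between the highest and the lowest mark."""
    if not marks:
        raise ValueError("at least one mark is required")
    return max(marks) - min(marks)


# Marks awarded to Candidate Z on Paper I by the seven examiners (out of 100).
candidate_z = [16, 36, 38, 44, 44, 46, 60]

print(mark_range(candidate_z))  # 44, the range quoted for Candidate Z
```

The same function applies unchanged to any of the other sets of marks in this Report, whatever the maximum mark.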

On Paper I, nine candidates were awarded a Pass by all the
examiners. Of the 39 candidates who were awarded a Failure
mark by one or more examiners, 25 were awarded a Credit,
8 Special Credit, and 3 Distinction by one or more examiners.
Again, two of the examiners awarded between them Distinction
to six candidates. The awards of the other examiners to these
six candidates were as follows:

No. of Candidate    Awards of Other Examiners

       1            Failure; Pass; Credit; 3 Special Credits.
       2            Failure; 4 Credits; Distinction.
       3            2 Failures; 4 Credits.
       4            2 Passes; 4 Credits.
       5            Pass; 3 Credits; 2 Special Credits.
       6            4 Credits; 2 Special Credits.

27. In Paper II (Literature) the variations of award, though
great, are somewhat less than in Paper I.

The marks of the candidates in regard to whom the divergencies
were greatest, were as follows:



Candidate 

Marla rereired from the settOi 
examiners {oed of 100) ^ 

Jtonge 


V 

10, 41, 46, 46, 46, 40. 68 I 

30 


Q 1 

37, 60, 82, 62, 64, 63. 71 ' 

84 


B j 

38, 30, 46, 47, 53, 66, 70 

32 


In Paper II, again, 36 of the 48 candidates were passed by all
seven examiners.* Of the remainder 3 received a Failure mark
from only one examiner, and 8 from 2 to 4 examiners; but
in all those cases the candidates were awarded from one to three
Credits by the other examiners. The nearest approach to unanimity
was in the case of one candidate who was ploughed by six
examiners, but was awarded a Credit by the seventh.

* It is interesting to note the general opinion of the examiners that the
standard in Set-books was much higher than in Précis and Essay. See the
article on English Composition at the School Certificate Examination by Sir
Philip Hartog in the “Essays on Examinations” published by the Committee.

Two of the examiners between them awarded Distinction to
five candidates. The awards of the other examiners to these
five candidates were as follows:

No. of Candidate    Awards of Other Examiners

       1            Pass; 4 Credits; Special Credit.
       2            Pass; 4 Credits; Special Credit.
       3            2 Passes; 4 Credits.
       4            6 Credits.
       5            3 Credits; 3 Special Credits.

28. The following Table shows the numbers of awards of the
different examiners on Papers I and II separately:


PAPER I 


Bttom. 

iiter 

ure 

Pom 

Credit 

iSfpeotal 

Credit 

Dietine- 

non 

FaiU 

ure 

Pose 

Credit 

iSpectoI 

Credit 

Dietino, 

lion 

A 

7 

16 

16 

7 

4 

2 

12 

30 



B 

7 

10 

23 

6 

3 

0 

2 

27 


4 

0 

12 

29 

7 

0 

0 


21 

17 


■■ 

P 

9 

17 

20 

2 

0 

1 

■■ 

26 

16 

Bl 

E 

8 


18 

2 

0 

e 


23 

2 

Bl 

P 

6 

21 

17 

4 

0 

2 

Kl 

36 

mm 

mM 

0 

86 

11 

2 

0 

0 

0 

mm 

18 

■9 

1 1 


PAPER 11 


It is to be remembered that Examiners B and F ordinarily
examine for one Examining Body, and Examiners C, D and E
ordinarily examine for another Examining Body.

29. We believe that the method of selection of examiners for
our investigations was such as to enable us to draw general
conclusions from our results. The independent investigation
carried out by the Durham University Board yields valuable
support to our conclusions.

Special Place Examination (I): Arithmetic and English

30. This was the most complex of all the investigations, since
it dealt with two subjects. The scripts of 150 candidates in
Arithmetic and in English were marked by 10 examiners in each
subject. The marking-schemes were settled after correspondence
with the examiners, each of whom marked 60 trial-scripts in
accordance with a draft marking-scheme before expressing his
opinion on the scheme. The marking-schemes were modified in
such a way as to deal with all the points raised by the individual
examiners, and they were finally settled only after an assurance
had been received from each of the examiners in the subject
concerned that the schemes contained no ambiguities.

31. The 150 scripts for the final investigation included a large
proportion of the very best sent in for the original examination,
as judged by the original examining authority. A very high
proportion of these scripts would therefore be scripts of successful
candidates and of those who approached success.

32. The results were first analysed in the following way:
At the original examination the fate of a candidate would primarily
depend on the marking of a couple of examiners, one for English
and one for Arithmetic. Of the examiners actually employed,
couples were chosen at random and designated A, B, C, D, etc.
As an example of the differences of marks of these couples,
we may choose Candidate No. 1, who received from the 10 couples
of examiners the following marks out of a maximum of 200:
106, 107, 109, 110, 119, 124, 124, 130, 130, and 140, the range
being 34 marks. The average range for all the candidates was
33 marks, the smallest range 12, and the highest 63. This range
must be regarded as considerable in view of the fact that the
examinations were of an elementary character, that the examiners
were experienced in this type of work, and that they were marking
according to carefully drawn-up marking-schemes.

33. In the type of examination where there are many assistant-
examiners the Chief Examiner criticises their marks, and makes
adjustments for different standards of marking. The distributions
of marks are also sometimes reduced to a standard, but no such
adjustment would alter the order of the candidates in the batch
assigned to a single assistant-examiner. At a competitive
examination of this kind the absolute mark does not matter,
as it does in the case of a School Certificate examination. It is
only the order that matters, and we must therefore consider this
point.

34. The following are the most important results with regard
to the first 50 candidates:

33 candidates are returned in the first 50 by all 10 couples
 8 candidates are returned in the first 50 by 9 couples
 4 candidates are returned in the first 50 by 8 couples
 4 candidates are returned in the first 50 by 7 couples
 1 candidate is returned in the first 50 by 6 couples
 1 candidate is returned in the first 50 by 4 couples
 5 candidates are returned in the first 50 by 3 couples
 7 candidates are returned in the first 50 by 2 couples
12 candidates are returned in the first 50 by only 1 couple
--
75

Thus 33 candidates would get into the first fifty places
whichever couple of examiners marked their scripts; but
the fate of the other candidates for the next 17 places would
depend on the chance of being assigned to particular couples,
the chance of success being greater for some candidates than
for others.
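
The arithmetic behind this conclusion can be retraced in a few lines. The sketch below is only an illustration: the variable names are ours, and the cut-off is taken as the first fifty places, as stated in the paragraph above.

```python
# Number of couples (out of 10) that placed a candidate in the first fifty,
# mapped to the number of candidates with that count, as tabulated above.
placed_by = {10: 33, 9: 8, 8: 4, 7: 4, 6: 1, 4: 1, 3: 5, 2: 7, 1: 12}

total_named = sum(placed_by.values())   # placed in the first fifty by at least one couple
unanimous = placed_by[10]               # placed there by every couple
open_places = 50 - unanimous            # places not settled unanimously
contenders = total_named - unanimous    # candidates whose fate depends on the couple

print(total_named, open_places, contenders)  # 75 17 42
```

On these figures 42 candidates are in contention for the 17 remaining places, which is the element of chance the paragraph describes.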

35. There is much less agreement with regard to the lowest
third of the whole group, so that the element of chance in the
award of special places on the plan adopted is very considerable.

36. We now consider Arithmetic and English separately, taking
first Arithmetic.

Out of the 150 candidates in Arithmetic there are 63 who got
80 or more marks from at least one examiner, and of these 18 got
80 or more from all examiners. Supposing we regard 80 as a high
mark intended to indicate scholarship level, we find complete
agreement among the examiners in regard to only 18 out of the
63 possible.

37. The Arithmetic Paper was divided into two parts, Part A
and Part B. Part A consisted entirely of twenty straightforward
calculations. The variations in dealing with this part were
very small, and mainly due to the illegibility of the writing of
certain candidates. The average range, i.e., difference between
the highest and lowest marks, was only 2.1 per cent. of the
maximum, whereas the average range for the two Parts was 
14^ per cent. 

38. In spite of the elaborate precautions taken in the marking-
scheme, there were very great differences between the examiners
in dealing with Part B, which included problems. In a question
at which the maximum was 16 marks, one candidate received
16 from one examiner, 12 from three examiners, 8 from two,
7 from two, and 4 from two examiners. But for other candidates
there was greater agreement. For 20 candidates the marks were
exactly the same, for 33 the marks only differed by 3 or 4 out
of the 16 maximum.

39. The English Paper consisted of two Parts, A and B, of
which A was an Essay Paper. A detailed scheme was used for
marking the Essay, marks being awarded for the following seven
separate elements: Vocabulary, Accuracy, Craftsmanship,
Consistency, Completeness, Substance and Quality. The maximum
for each element was 7 marks.¹ In respect of Vocabulary, only
one-third of the candidates got the same mark from as many as
five out of the ten examiners.

40. The averages for the different examiners varied considerably. 
The variation of the averages of the different examiners for the 
several elements is shown in the following Table : — 


(Maximum = 7 marks for each element)

Element          Highest    Lowest    Difference
Vocabulary         5.93      3.09       2.84
Accuracy           5.36      3.05       2.31
Craftsmanship      4.69      3.20       1.49
Consistency        4.93      2.00       2.93
Completeness       6.41      3.11       3.30
Substance          6.51      3.16       3.35
Quality            4.52      3.06       1.47


41. The mean deviations of marks also differed considerably,
and the average of the mean deviation of examiners varied from
element to element as shown below:


foeatulwy 1 

{Acesraej/ j 

1 Gntfitmamhip \ 

CamUtmey 

CompMemt 

SuMom* 

Quaiilif 

1-23 

1-26 

1 

W9 

1-17 

1-22 

1>33 


42. The paper on English, Part B, dealt mainly with the sense
of passages, the sense of phrase, and the sense of single words.
Except with regard to one question, for which 60 candidates
received the same mark from all the 10 examiners, the agreement
was small.

An elaborate analysis has been made of the marks awarded for
parts of a question, which cannot easily be summarised here.

Some examiners marked consistently higher, some consistently
lower, than the majority; others marked sometimes high,
sometimes low, and it is obvious that an examiner who does this
will alter the order of the candidates considerably from the order
of the majority.

¹ The investigators employed this scheme because it had been used in
similar examinations, but are not in any way committed to the view that
it is satisfactory.

Special Place Examination (II): English Essay

43. The question-paper gave a choice of one out of four
subjects, and the time allowed for the work was 30 minutes.

The main object of the investigation was to compare the results
of marking when such essays are marked on impression only,
with the results when they are marked in accordance with a
detailed marking scheme.

44. Typed copies were made of 15 trial scripts, and circulated
to the ten examiners concerned, together with a draft detailed
marking scheme. The marking scheme was then amended to
meet the criticisms of the examiners, and answers to all doubtful
points were furnished to them.

45. Typed copies were then made of 150 other scripts, of which
the marks originally allotted to them by the examining authority
showed that they varied in marking from very poor to very good.
Each examiner received not only a typed copy of each of the
essays, on which it was possible for him to insert marks, but also the
original script, which he could mark for handwriting. The following
were the most important instructions issued to the examiners:

(i) Scripts 1-75 are to be marked by impression only. It is of
the essence of the investigation that, in marking these scripts, no
attempt should be made by the examiner to conform to the
scheme of marking set out under (iii) below, or to any scheme
of the kind. Examiners are particularly requested to mark scripts
1-75 before they mark scripts 76-150.

(ii) Scripts 76-150 are to be marked according to the amended
marking scheme.

(iii) The maximum mark for all scripts is 100. The examiners
were furnished with the amended marking scheme, from which the
following is taken:

Marks will be allotted:

(1) Quality and control of ideas      50 marks
(2) Vocabulary                        15 marks
(3) Grammar                           15 marks
(4) Structure of sentences            10 marks
(5) Spelling                           5 marks
(6) Handwriting                        5 marks

    Total                            100 marks

46. In order to test whether the scripts of Set 1, comprising
Nos. 1-75, and those of Set 2, comprising Nos. 76-150, were
approximately equivalent, the sets were re-shuffled and re-
numbered, and were then marked by three examiners, X, Y and
Z, other than those who took part in the examination of the
final scripts. X, Y and Z were all members of the same panel of
examiners for a Special Place examination, though not for this
particular one. Subsequently the marks allotted by X, Y, and Z
were re-grouped according to the original numbers, 1-75, 76-150,
and it was found that the average mark allotted by each examiner
was the same for Set 1 as for Set 2, although this average differed
from examiner to examiner. It was also found that the distribution
of the marks of each of the examiners was approximately
the same for Set 1 as for Set 2.

It would therefore appear that any difference in the main 
investigation between the markings of the two Sets by an individual 
examiner must be due to the difference of method employed, and 
not to a difference between the two Sets. 

47. The first and most striking results of the main investigation
are given below:

AVERAGE MARKS AWARDED BY THE EXAMINERS






JEicminen^ 




DiJBmmua 

MgMitA 

laeut 

ouengv 

1 

A 

B 

0 1 

1 E 

1 ' 
! ^ ' 

K 

i L 

- - 1 

P 

Setl— 
(Imprenion 
hburldng) - 

40-0 

43'7 

69<4 

iai-s 

1 j 

1 

44*8 ! 

47'6 

61-2 1 

M j N 
4U-0 j 48-2 

41-7 

27-6 

Seta- 

(SeMied 

flO-a 

M-O 

02-3 

1 

1 

j B8-8 ^ 

> 

1 

eS'S ! 49-3 

638 

( 

1 

so-s 1 m-o 

1 . . 

e4-s 

130 

Pifiuviwe • 

n>6 


2-9 

j27'0 

1 '3'® 

f 

1-8 

2-3 

10-6 j 9-S 

12'8 



Thus in every case the average mark awarded to Set 2 for
scripts marked by details was greater than the average of marks
awarded to Set 1 for scripts marked by impression.

* Examiners A, B, C, E, G and K are the examiners in English who were
designated by those letters in the previous investigation on the Special Place
examination. L, M, N and P are examiners who did not take part in the
previous investigation, but, like the other examiners, they are all
experienced in examining of this kind.

With Examiner E the difference between the averages is
27 marks; with Examiners A, B, G, M, P, the difference is about
10 marks, and with only 3 examiners is the difference small.
Thus the marking by details produces higher marks on the average
than the marking by impression. It is also noteworthy that the
averages of the several examiners are closer together when the
marking is made by detail than when it is made by impression
only. The mean deviation of the averages of the impression
marks is 6.2, and that of the averages of the detailed marking
only 3.4. Again, the average range of marks was 36.6 for the
marking by impression, and 28.9 for the marking by the detailed
scheme. The analysis shows that the marking by means of a
detailed scheme yields on the whole closer results from the
different examiners than the marking by impression.

48. Marking by impression shows very great differences between
the examiners. The greatest difference was shown in the marks
of a candidate who received the following marks: 16, 50, 63,
68, 16, 78, 62, 75, 48, 71, 64, showing a range of 63. The lowest
range was 13, and the average was 36.6. In the marking by
details the highest range was 52, in the case of a candidate who
received marks varying from 26 to 78, and the lowest range
was 14.6. The average range was 28.9.

A detailed analysis of the figures showed that the greater
ranges yielded by the marking by impression are not due to
a greater figure for random marking, but to a greater difference
between the standards adopted by the different examiners. The
analysis shows that the element of random marking has, roughly
speaking, the same magnitude under both methods.

49. This last point is important. It means that the use of a
detailed marking scheme conduces to a closer approximation
of the standards of examiners, but that it does nothing to reduce
the element of random marking.

4Hh. iBmtm betswen the different examiners is very 
|pM|il,i. Ill 4hs tnsrkiBg by impression Examiner E awards 
Wlliurtli << hwatjasB 40, ISxamiiuir 0 only 2. On the other 
tlrtoark* of 72 or mote, and Examiner E 
Ifbti ttHM, 1ft Ihft BMwkfag by details Examiner M gives 
fl iwiitodf hill thsij dfr, sand Examiner 0 gives none. Examiner B 
il pula «t 7ft oe iaai», and Eramlnw M gives only 6. 
There are only two examiners whose marks show approximately
similar distributions, and whose averages are approximately the
same when marking by the two different methods.

51. On the other hand, the averages of the two standard
deviations for the two methods of marking are the same; in other
words, the method of marking by impression and the method of
marking by details produce, on the average, the same degree
of discrimination between the different candidates, that is, the
same spread of the marks.

52. Although this is true of the averages, some examiners show
a different standard deviation in their marks by the two methods.

53. The average ranges for the different elements are shown
below.



Thus the average difference between the extreme marks awarded
is a high percentage of the maximum mark in each case.

54. It will be seen that the greatest average range occurs in
grammar, and the least in handwriting. There are quite large
numbers of candidates for whom the ranges of marks are as great
as half the maximum in respect of all the elements of the test
except handwriting.

55. The number of cases (out of the total of 75) in which six
or more of the examiners agree are as follows: Ideas, 32;
Vocabulary, 44; Grammar, 20; Structure, 28; Spelling, 48;
Handwriting, 63.

56. It has been seen that examiners give higher marks when
marking by details than when marking by impression. An
attempt was made to discover how the examiners distributed
among the various categories of candidates the excess of marks
resulting from the second method of marking. It showed that
they tended on the whole to favour scripts that were “average”
to “just above the average,” and to undermark the other
categories, especially the “very good”; but the differences were
small.










College Entrance Scholarship Examination: English Essay

57. The Paper, which was set at an Entrance Scholarship
examination for a group of colleges in a University, gave
a choice of four subjects for the essay, but no further instructions.
The time allowed was 3 hours.

58. Fifty scripts were selected from a larger number, so as
to include those of five holders of scholarships or exhibitions.
They comprised the scripts of all the 10 candidates who had
selected the first subject; of all the 8 who had selected the second;
of all the 11 who had selected the third, and of 21 who had selected
the fourth.

59. The examiners were asked to assign numerical marks with
a maximum of 100, and also to assign a class to each candidate
in accordance with the following scheme:

Class I      67 marks and over.

Class II     50 marks to 66 marks.

Class III    33 marks to 49 marks.

Class IV     Under 33 marks.
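
The class scheme amounts to a simple threshold rule. As an illustrative sketch only (the function name is ours, and the lower bound of Class II is read as 50 marks, the mark adjoining the top of Class III), it could be written as:

```python
def assign_class(mark):
    """Map a numerical mark (maximum 100) to Class I-IV under the scheme above."""
    if not 0 <= mark <= 100:
        raise ValueError("mark must lie between 0 and 100")
    if mark >= 67:
        return "I"
    if mark >= 50:   # assumed lower bound of Class II
        return "II"
    if mark >= 33:
        return "III"
    return "IV"


print(assign_class(67), assign_class(66), assign_class(49), assign_class(32))
# I II III IV
```

Because the bands adjoin one another, a difference of a single mark between two examiners at a boundary is enough to move a candidate from one class to the next.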

60. The numerical marks varied considerably. The range of
the marks allotted to candidates varied from 7 to 36, and the
average range is 19.6 per cent. The extreme cases are shown
below:



[Table: marks awarded by each examiner in the extreme cases, with the range for each candidate. The figures are illegible in this copy.]

d. the aMttis awarded by the different examiners, 
mik$ twit w«ns clew together. They are as follows : 
A, «!•# t mj B, 84*0 j Jl, 60-6. 

Hi# Tiihlt show* the statiskoel distrihutdon of 
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A 
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B 
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16 1 

1 

D 



12 ' 

2 

E 

Hi 

HI 

U I 

1 
62. The following Table shows the awards of all the examiners
to the 26 candidates who were allotted either a First Class or
Fourth Class by any examiner:

The candidates whose numbers are marked with an asterisk
were placed in three different classes by different examiners.
Perhaps the most striking instance of discrepancy is that of
Candidate No. 26, who is given a First by Examiner E, but only
a Fourth by Examiner B, although B is more generous with Firsts
than any other examiner.

63. It is especially interesting to see the different selections of
candidates by the different examiners for a First Class.


immiintr A 

B 

C 

J> 

E 


Oaniidaiu 


Firtl Olats 


Noa a 

8 

8 

22 

8 

10 

8 

16 

40 

21 

18 

11 

26 

— 

22 

16 

32 

40 

— 

26 

17 

36 ' 

46 

— 

36 

as 

87 


— 

— 

86 

41 

— 

— 

— 


44 





It will be seen that not a single candidate out of the seventeen
was placed in the First Class by more than three out of the five
examiners. Three candidates each received three votes; four
candidates each received two votes, and the other ten had only
one vote each. Thus the consensus of opinion in the cases that
most matter is extraordinarily small.
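
The vote counts just given can be tallied to see how thin the agreement is. This is a minimal sketch using only the figures stated in the paragraph above; the variable names are ours.

```python
# Votes for a First Class received per candidate -> number of such candidates.
votes = {3: 3, 2: 4, 1: 10}
examiners = 5

candidates = sum(votes.values())                      # candidates given a First by anyone
total_votes = sum(k * n for k, n in votes.items())    # First Class awards made in all
mean_votes = total_votes / candidates                 # average votes per candidate

print(candidates, total_votes, round(mean_votes, 2))  # 17 27 1.59
```

On average each of the seventeen candidates was placed in the First Class by fewer than two of the five examiners.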

64. It is noteworthy that though there is comparatively little
difference between the averages of the different examiners, the
order in which they place the candidates differs greatly. It is
clear that in an examination of this kind the marks obtained
by the candidates are to a very great extent a matter of chance,
depending on the particular examiner by whom the essay
is marked.

University Mathematical Honours

65. The Paper contained 12 questions, four relating to
differential equations, and eight relating to analytical geometry of
three dimensions. Candidates were informed that they might
attempt any number of questions, but that full marks might be
obtained on about six. Three hours were allowed.

66. Twenty-three scripts were marked independently by six
examiners, A, B, C, D, E and F. The scripts were then
independently revised by the pairs of examiners AB, CD, EF.
There were thus produced six sets of original marks and three
sets of revised marks. The nine sets of marks are printed below:


Maximum Mark 300


Sxam-^ 
4ntr 1 

1 

A ' 

i 

B 

1 1 

C 

1 

I) 

n 1 

1 

1 

V 

Sangr 

(A.B)|(C.D) (R.EllBlWjw 

Ocmii 
iaU 1 
1 

1 

309 

186 

1 

223 

236 

226 

212 ^ 

^ 60 

1 108 

230 

2ID 

32 

* 1 

200 

205 

180 , 

103 

,206 

208 

28 

203 

133 

, 207 

24 

8 1 

201 1 

208 

172 ' 

108 

197 ' 

179 

36 , 

, 203 

186 

190 

IT 

4 1 176 

103 

1 172 

[ 177 

212 

189 

40 1 

186 

177 

< 210 

33 

6 

81 

94 

1 81 

100 

123 1 

145 

64 ‘ 

86 

96 

128 

42 

6 

200 

217 

1 203 

' 206 

207 

187 ! 

30 

207 

208 

106 

n 

7 

110 

140 

1 137 

167 

134 

150 

1 38 ^ 

1 126 

145 

la I 

1 20 

8 

167 1 

201 

, 187 

198 

190 

ISO 

1 34 

188 

194 

190 

1 8 

9 

147 

166 

' 127 

130 

140 

147 

28 

161 

138 

144 

13 

10 

203 

220 

203 

1 192 

205 I 

208 

28 

216 

203 

1 207 

» 

11 ' 

86 

66 

79 

78 

lOS ' 

65 


76 

87 

88 

1 IS 

12 1 

133 

122 

1 140 

128 

127 

133 

' 18 

128 

137 

130 

9 


224 

228 

230 

263 

222 

241 

31 

220 

246 

239 

36 

, 

14 

215 

226 

328 

, 223 

1 234 

217 

19 

220 

228 

226 

6 

16 ' 

224 i 

1 246 

265 

282 

216 

246 

46 

1 239 

280 

241 

SI 

IS 1 

96 

120 

136 

143 

136 1 

i m 

1 ^ 

117 1 

136 

131 

19 

17 1 

166 

161 

171 

168 

178 

m 

1 17 

163 

171 

178 

16 

18 

287 

294 

290 

303 

300 

303 

1 21 

290 

300 

302 

IS 

10 

i 123 

101 

66 

100 

1 114 

102 

67 

113 

01 

lOS 

S2 

30 

1 164 

126 

118 

122 

163 

178 

1 67 

132 

m 

169 

46 

SI 

117 

1 102 

120 

131 

136 

113 

34 

110 

120 

123 1 

IS 

23 

SO 

73 

76 

1 81 

76 

87 

16 

79 

83 

81 

4 

S3 

271 

278 

277 

287 

273 

282 

16 

270 

283 

278 ] 

8 

Arat» 

•SO 

1889 

1721 

1 

108 7 

177 3 

179-1 

1 177 6 

W7 

170 8 

1749 

179 3 

las 

Mean 

Dati. 

•ttoaw* 

48 


68 

1 


46 

1 

( 63 

& 

48 

1 ■” 

1 


¹ The mean deviation of a series of numbers is the average of their
differences from their average.
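
The definition in this footnote translates directly into code. The sketch below is an illustration only: the function name is ours, and the three marks used are those assigned to Candidate No. 4 by the pairs AB, CD and EF, whereas the Report applies the measure to each examiner's whole column of marks.

```python
def mean_deviation(values):
    """Average of the absolute differences of the values from their own average."""
    if not values:
        raise ValueError("at least one value is required")
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Marks assigned to Candidate No. 4 by the three pairs of examiners (maximum 300).
pair_marks = [186, 177, 210]

print(max(pair_marks) - min(pair_marks))     # range: 33
print(round(mean_deviation(pair_marks), 2))  # mean deviation: 12.67
```

Unlike the range, which looks only at the two extreme marks, the mean deviation takes every mark in the series into account.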
67. It will be seen that the maximum difference of the averages
of the individual examiners is about 11 marks, just under 4 per
cent. of the maximum mark. The maximum difference of the
averages of the three pairs of examiners is 8.5 marks.

68. It is interesting to note that the spread of marks, as
measured by the mean deviation, is roughly the same in the case
of each examiner and of each pair. There is thus no evidence
here that when pairs of examiners allot marks they necessarily
award marks with a smaller spread than when they act
individually.

69. The differences of the averages yield very little indication
of the differences of the marks allotted to individual candidates.
The six independent markings of Examiners A to F yield ranges
of which the lowest is 17 and the highest 64, with an average of
34.7 on a maximum of 300.

70. The procedure of settling marks on the verdict of two
examiners, though it affects the averages very little, had a much
greater effect in reducing the ranges, of which the extremes
were 8 and 46, and the average 18.3. The fact that in an
examination of this kind two out of three pairs of examiners can
differ by as much as they do in the case of Candidate No. 20, who
was assigned 133, 128 and 169 marks, or of Candidate No. 4, who
was assigned 186, 177 and 210 marks, is remarkable.

71. It should be noted that the examiners agree in their
placing of the first two candidates at the top of the group and in
placing the 13th in order of merit. They do not agree in the
placing of the other 20. On the other hand, it is noteworthy
that the pairing of the examiners notably diminished the difference
in the order in which the candidates are placed.
f%. Tlw f^lowlog iMtenoes of the diffe^oe of opinion between 
snsaiBioecs at* strikiug Candidate No. 1, whose 
jlMtatwIitiilftittiMiQdlivldnalexaminsts (Examiner E) 

In IMIi (IhcMstaMi!' B) <if tit* 28 oaadidateB, is |daoed lOtii by the 
liilt JJ* iMdea fitb by the pair CD (marks 230) and 6th by 
ihtiaiit XP (natles iU3). Oaadid at * No. 4 is placed 12th by the 
wirABiMwlMXbaliphwed ISth by the pair CD (marks 177), 
Iwl li flMtl m (BMitia 31^ by the pair EE. The pair of 
AE iMiid til* |Mi|r W titsucA Candidate No. 1 and 
jNbi. 4 a* iwl betog; wary dlffstent in merit, tmtupared 
t* aaedl tfia* (tiiMtilr tiny pa* titsm in vary different places 
among their co-examinees); while the pair of examiners CD
regard them as differing widely in merit.

University History Honours

73. The examination papers were four in number, on the subjects
shown below:

Paper I      Ancient and Mediaeval History.

Paper II     Mediaeval and Modern History.

Paper III    An Essay Paper with a choice from a number of
             subjects.

Paper IV     Political Thought (prescribed books).

Instead of numerical marking, a scheme of literal marking was
adopted in accordance with the practice of most History
examinations in this country. Owing to this fact the section on this
subject does not lend itself to condensation and is therefore
given in full in Appendix I, pp. 69-77, below.

Viva Voce (Interview) Examination

74. The viva voce examination, not on a “subject,” but of a
general character to test “alertness, intelligence, and general
outlook,” is an important element not only in Civil Service
examinations but at interviews for the selection of candidates for
public and private appointments generally.

It appeared, therefore, desirable to test the degree of
concurrence of two Boards of Examiners appointed to conduct an
examination of this kind.

75. In order to secure a satisfactory basis for such an
investigation, it was necessary to get together a suitable team of
candidates.

The following conditions seemed desirable:

(i) that the candidates should be approximately of the same
age and have received the same kind of training;

(ii) that the candidates should be provided with an adequate
stimulus, not only to secure their presence but to make reasonably
sure that they would treat the examination with the kind of
seriousness that is to be expected of candidates competing for
an appointment;

(iii) that the examiners should be provided with a suitable
criterion by which the candidates were to be judged;

(iv) that the examiners should be persons of experience,
used to judging candidates by interview or viva voce
examinations.

76. It was decided to offer a prize of £100 on the results of a
viva voce examination of this kind. The examination was limited
to students who were studying, or who had recently studied, at a
university, and were certified by the university authorities to be
suitable, in their judgment, as candidates for the Junior Grade
of the administrative class, Home Civil Service [this is the
technical name for the appointments of the highest grade in the
Home Civil Service, open to competition]; and the candidates
were required to be within the age-limits prescribed for
that examination for the year 1934 (21 to 28 on August 1,
1933).

77. The scope of the examination was defined, as in Civil 
Service regulations (see para. 81 (d) below). 

78. Thirty candidates applied, and of these 16 (12 men and
4 women), with excellent University records, were selected for
the purpose of the examination. They had received their training
in one or more of the following Universities and Colleges:
Oxford, Cambridge, London, Bristol, Glasgow, University College,

University College, Southampton. Each candidate
filled in a form similar to that required by the Civil Service
examining authorities, to which was attached a confidential report
from a tutor or other university authority and a report by the
candidate himself on his life and education. Copies of these
documents were furnished to the examiners.

79. The two Boards were constituted from the following persons:
Professor Ernest Barker, Professor of Political Science,
Cambridge, formerly Principal of King's College,

Imubv Vwrnt 

Sir Frank Dyson, K.B.E., F.R.S., late Astronomer Royal.

Mrs. Mary Agnes Hamilton, formerly M.P. for Blackburn.
Mm HL IHA., Warden of Ki^’s Ckfiiege of Household 

mmS ** — 

MMil fiMiPI CXEMmOvIii 

fin EiWltf I ha M M #, OJB.» formerly Senior Chief Iru^otor, 

Mira (Hr JMnMIlKilL 

a i. fiMM, Btarthotiffs Profossw 
IdiMMn fo Ms tlolvMity Ql L 
Mr. L. B. Turner, Fellow of King's College and University
Lecturer in Engineering, Cambridge.

Dr. W. W. Vaughan, late Headmaster of Rugby.

80. It was originally intended that the two Boards should
have the same number of members, but one of the prospective
examiners, the Head of an important college, was accidentally
prevented from attending on the morning of the examination,
and could not be replaced at the last moment. The examination
was held on 27 March, 1934.

81. The following are the more important instructions given
to the examiners:

(a) There will be two Boards of Examiners, Board I and Board II.
The first business of each Board will be to elect their Chairman, and
to discuss any details of procedure other than those provided for in
the scheme set out below.

(b) There will be sixteen candidates. These will be divided into two
groups, Group A and Group B. Candidates in Group A will appear
in alphabetical order first before Board I and then before Board II.
Candidates in Group B will appear in alphabetical order first before
Board II and then before Board I.

(c) Each candidate is to be examined for not less than a quarter of an hour
and not more than half an hour.

(d) Particulars of each candidate, extracted from his* application, will
be available for each examiner. The original application will be in
the hands of the Chairman. The following is to be taken as the
general direction with regard to the method of the viva voce
examination.

The examination will be in matters of general interest, not in
matters of academic interest; it is intended to test the candidate's
alertness, intelligence, and intellectual outlook. Each candidate
has furnished a record of his life and education. On the interview
and record the examiners will judge the value of the candidate's
personality for the Home Civil Service.

The maximum mark is 300.

(e) The following procedure will be adopted with regard to the recording
of marks:

As soon as the viva voce examination of a candidate is over,
and before any discussion of his merits has taken place, the Chairman
will ask each of the examiners to write down his mark on the mark-sheet
and he will also write down his own mark on his own mark-sheet.
The Chairman will then ask the other examiners to state the marks
so written down and will finally state his own mark so that each
member of the Board may know what marks have been allotted in
the first instance by the several members of the Board and be able to
record them on his mark-sheet; a discussion will then take place
on the different marks proposed and the Chairman will record a mark
representing the view of the Board as a whole, this mark being
obtained either by agreement or, if that is impracticable, by taking
an average of the marks allotted by the several examiners.
N.B. The Chairman of each Board is requested to see that
the above arrangement is strictly observed, as it is regarded as an
essential feature of the Examination.

(f) Suitable mark-sheets will be provided.

(g) Examiners are requested to sign their mark-sheets and give them
in to the Chairman of the Board.

* The candidates and the Boards of Examiners will include women as
well as men; the masculine gender is used with reference to candidates
and examiners for the sake of simplicity.

82. At the end of the day, each Board carefully reviewed its
marks, in order that the members might be sure that the marks
allotted translated correctly their impressions of the relative
abilities of the candidates.

83. The marks awarded are set out below:

84. The order in which the candidates were placed is shown
below:


Candidate   Board I   Board II   Board I   Board II
            Marks     Marks      Order     Order

    1         120       212       15½        11
    2         260       190        1         13
    3         130       175       14         15½
    4         230       260        4*         2
    5         240       232        3          7½
    6         180       250       12          3
    7         200       270       11          1
    8         250       224        2          9
    9         230       220        4*        10
   10         210       235        8½         6
   11         210       238        8½         5
   12         230       232        4*         7½
   13         120       177       15½        14
   14         210       247        8½         4
   15         220       ...        7         12
   16         170       175       13         15½


85. The orders of merit of the two Boards are very different.
The candidate placed first by Board I is placed thirteenth by
Board II, and the candidate placed first by Board II is placed
eleventh by Board I.

The prize was awarded to Candidate No. 4, who was placed
second by Board II and bracketed fourth by Board I.

86. There were no cases of complete agreement; the closest
were the cases of Candidates Nos. 9, 12, and 16 with 10, 2, and
5 marks difference respectively. On the other hand there were
extreme cases of disagreement, Candidates Nos. 1, 2, 6, and 7 with
92, 70, 70, and 70 marks difference. The average difference is 37
marks. These extreme differences between the two Boards'
estimates of the candidates' merits, amounting to 20 to 30 marks
out of 100, and the average difference of about 12 marks out of
100, point to the unreliability of the interview test, and indicate
the great influence that this test might have in the final placing
of a candidate in a Civil Service examination.
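
The conversion from marks out of 300 to marks out of 100 can be checked with a short sketch. It uses only the differences quoted in the paragraph above and the maximum mark of 300 laid down in the instructions; the function name `per_hundred` is an illustrative choice of ours, not part of the report.

```python
MAX_MARK = 300  # maximum mark for the viva voce examination

def per_hundred(difference, maximum=MAX_MARK):
    """Rescale a difference in marks to a scale of 100."""
    return 100 * difference / maximum

extreme_differences = [92, 70, 70, 70]  # Candidates Nos. 1, 2, 6 and 7
average_difference = 37

for d in extreme_differences:
    print(d, "marks out of 300 is about", round(per_hundred(d), 1), "out of 100")
print("average:", round(per_hundred(average_difference), 1), "out of 100")
```

The extreme cases fall between roughly 23 and 31 marks out of 100, and the average at a little over 12, in line with the rounded figures given in the text.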

87. The coefficient of correlation between the marks of the two
Boards is 0.41. This is comparatively small, and in view of the

*l!h« S osndidatM bnMketed m squal sf Us tb« ttnrii two osoduhtM ha'ra MAtlwl 

*• ** foactii ” In ontor of merit, in aooonisnGe with the wniil pnetke in wethitlae l tehtw. 
number of candidates involved cannot be considered "significant"
in the usual sense. We must remember that the marks awarded
are determined by two factors, the candidates and the Boards,
and we must conclude that the different influences of the two
Boards have been sufficient in this case almost to mask the
common influence of the same set of candidates.
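
The report does not say which test of significance was applied. As a rough modern check, and purely as an assumption on our part, the sketch below uses the product-moment coefficient and the familiar Student's t test with n - 2 degrees of freedom. The helper names and the tabulated critical value are ours.

```python
import math

def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two lists of marks."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t value for testing a correlation r observed on n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Figures from the text: r = 0.41 on 16 candidates.
t = t_statistic(0.41, 16)

# Two-sided 5 per cent point of Student's t with 14 degrees of freedom,
# taken from standard tables (an assumption about the level of test).
T_CRITICAL = 2.145

print(round(t, 2), "significant" if t > T_CRITICAL else "not significant")
```

On these assumptions the t value is about 1.68, short of the tabulated 2.145, which agrees with the statement that the coefficient cannot be considered significant for so small a number of candidates.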

88. It is probable that the different questions asked of the
candidates leading to the different subjects discussed at the two
interviews affect the marks finally awarded to the candidates.
That the circumstances of the two interviews were entirely
different is apparent when we look at the individual assessments
of the examiners.

89. In the cases of candidates numbered 13, 3, 1, 2, and 7, the
two Boards' marks are entirely different; there is no overlapping.
The members of each Board were in agreement within different
limits as to the merits of these candidates, and in the case of
Candidate No. 1, for instance, the limits are absolutely separated.
Board I assessed the merits of this candidate at 120, the individual
examiners having awarded marks between 100 and 160; Board II
assessed the candidate at 212, the individuals having given marks
between 190 and 240.

90. These results show definitely that the evidence on which the
examiners could judge the candidate was different in the two
cases, that is, that the two interviews were so differently conducted
that we might almost suppose different candidates to have been
examined. In one respect there is a clear divergence between
the results of the two Boards, since the average mark of Board I
is 198, and the average mark of Board II is 220. The second
Board on the whole gave higher assessments to the candidates.

91. Another striking case is that of Candidate No. 2. Board I
gave him 260 marks, after very close agreement amongst the
examiners as to his merits; Board II gave him 190 marks, the
individual examiners' assessments ranging from 140 to 210.

9t. Tl*lndividitat«xamin«r8’8s»e{«mentsBhowveryolo8eagree> 
8WiMi» ewtldncaaea, Boturd 1 agreeing within 10 marks in the case 
Of Ckmlldl4»K«K.9, within 30 marks in tbe case of No. 3, Board II 
irNhitt m marlm la the csee of Carulidatos Nos. 3, 4, 11, 13, IS. 
of lha marks ue widely different. The different 
f gare to Candidate No. 16 100, 160, 180, 180, 
4Mi| ti9 iMiim } tlMigr gave to Candidate No. 4 170, 210, 220, 240 





and 280 marks; the examiners of Board II gave to Candidate
No. 9 166, 220, 260, and 270 marks.

93. The average range of marks allotted by the various
examiners to the several candidates was 51 in the case of Board II,
and 63 in the case of Board I: but if we leave out of account the
marks of Examiner C, which were consistently out of agreement
with those of the rest of Board I, the average range for this
Board is exactly the same as for Board II, namely 51.

94. This agreement can be appreciated by means of the
coefficient of correlation between the marks of the individual
examiners and the final award of the whole Board. These are all
significant when tested in the usual manner.

Coefficients of correlation of the marks of individual examiners with the final marks of the Board.

BOARD I

A B C D E
91 -eo <63 >80 -84 

BOARD II
F G H I

.74 .88 .82 .72

95. We find that the evidence shows that each examiner
on a Board was able to award a mark which was a fair
reflection, in most cases, of the evidence placed before the Board,
and therefore to agree with his colleagues as to the right mark.
As pointed out above, the evidence placed before the two
Boards was materially different, owing to the inherent nature of
an interview of this kind.¹

¹ I think that my impressions as an impartial and silent observer of the
proceedings of the two Boards (having also had experience in serving as an
examiner at such viva voce examinations) may be of interest. The mode
of approach of the two Boards seemed to me to be identical. They both
appeared to me to succeed in securing the confidence of the candidates by
tactful questioning and conversation carried on in nearly all cases as
between equals. The candidates spoke with freedom and frankness.
It was, of course, impossible for me to hear all the candidates examined
by both Boards. But I heard the two examinations of some of the
candidates in regard to whom the differences of opinion were most striking.
I came to the conclusion that, while the two Boards were equally skilful
in cross-examining in such a way as to reveal the weaknesses of candidates,
it was largely a matter of chance whether they struck on a topic in which
a candidate felt so strongly that he was able to display his individuality.
It would be impossible for me to quote the actual facts on which this
opinion is based without revealing the personalities of the candidates
concerned. (P.J.H.)
PART II. DIFFERENCES OF STANDARD AND RANDOM VARIATIONS
OF DIFFERENT EXAMINERS

96. In Part I of the investigation the marks allocated to the
work of a number of candidates by a number of examiners at
different kinds of examinations have been presented and analysed
up to a certain point.

Before proceeding further with the analysis, it is desirable to
consider briefly the processes by which the marks are obtained.

97. For this purpose it will be convenient to use the phrase "a
unit piece of work" to mean any written answer or script which
is accorded a mark independently of any other mark accorded to
any other piece of work. Thus the phrase may refer to a whole
English essay if the essay is marked purely by impression; or it
may refer to an answer to a simple arithmetical computation
which forms part of a larger question; or it may refer to an
element in an answer, such as "Vocabulary" in an essay, if this
is accorded a mark separately from other marks accorded to other
elements present in the essay.

98. An examiner, when assessing the value of a unit piece of
work, may have a standard or model to which he refers. For
instance, in Dictation an examiner would have before him the
original passage dictated, and in Arithmetic, he would have the
answer to a simple sum. In other cases, such a model piece of
work may not be available; but the examiner may have clearly
defined instructions as to how many marks to allot to a certain
answer, how many to take off for a certain type of mistake, and
so on. In other cases, again, he may have neither a model nor
instructions to follow, but he will have in his own mind
a sort of ideal answer.

99. The possibilities of different marks being accorded to a unit
piece of work by a number of examiners are a priori obvious.
Even when a perfect model exists to which reference may be
made, handwriting may give rise to discrepancies;
what is illegible to one examiner may be legible to another.
When the perfect model does not exist, different examiners may
read different meanings into the words and phrases and symbols
in the answer and so award different marks. Even when

a model answer is before the examiner, if it consists of a fairly
lengthy collection of words, examiners may differ in their judgment
of what is "like" and what is "unlike" the model. Again,
the state of health of an examiner may have an effect on his
marking as time goes on; his standards of what is perfection
may alter, and his judgment may wobble.

100. It is sometimes assumed that if two examiners allot the
same average marks, and especially if they allot the same
distribution of marks, to a group of scripts, their markings will be
identical throughout. Such resemblances may however co-exist
with a substantial difference in the marks awarded to individual
candidates; for differences of the kind to which we have referred
may be present, but may cancel out when averages are taken.
Thus, the average of two examiners, and their distribution of
marks may be the same, but nevertheless the order in which they
place candidates may be different.
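
A small made-up example may make the point concrete. The four candidates and their marks below are invented for illustration and are not taken from the investigation; the sketch shows two examiners whose averages and distributions of marks coincide while their orders of merit do not.

```python
from statistics import mean

# Hypothetical marks for four candidates P, Q, R and S.
examiner_x = {"P": 40, "Q": 50, "R": 60, "S": 70}
examiner_y = {"P": 70, "Q": 40, "R": 50, "S": 60}

def order_of_merit(marks):
    """Candidates listed from the highest mark to the lowest."""
    return sorted(marks, key=marks.get, reverse=True)

same_average = mean(examiner_x.values()) == mean(examiner_y.values())
same_distribution = sorted(examiner_x.values()) == sorted(examiner_y.values())
same_order = order_of_merit(examiner_x) == order_of_merit(examiner_y)

print(same_average, same_distribution, same_order)
```

Both examiners use exactly the same four marks, so every summary of the distribution agrees, yet candidate P is last for one examiner and first for the other.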

101. A practical illustration of the differences of examiners'
marks is taken from the investigation on the Special Place
Examination, English Paper B. The following are the detailed and
the total marks awarded by two examiners, B and D, to the first
ten candidates on the roll in this examination, for four questions.

In this illustration the averages of the total marks are the
same, and the averages for the different questions are practically
the same; yet in only one case are the marks exactly the same,
and in one case they differ by 7. The orders of merit are different.

102. Taking the evidence afforded by this series of scripts
marked by the two examiners, we might fairly conclude that they
both marked the four questions in this paper according to the
same standards; but the individual idiosyncrasies of the two
examiners are shown in the marks awarded to the candidates in
respect of the various questions, and are not entirely eliminated
from the totals, which therefore exhibit discrepancies.

103. It may happen that in addition to the kind of discrepancies
noted in the foregoing illustration one examiner may on the average
tend to award higher marks than another examiner for each unit
piece of work, so that his average mark for a whole script will be
higher. This kind of difference between two examiners will always
be revealed by an examination of average marks, but it may
accompany discrepancies of the kind already referred to.

104. The assumptions made and conventions used in this part
of the analysis are as follows:

(a) That a piece of work is worth a definite number of marks in a scale.

(b) That this mark would be allotted by the "perfect examiner."
We call this mark the "ideal" mark.

(c) That every examiner attempts to discover this ideal mark
but may fail (i) because his standard of marking differs from the ideal,
and (ii) because he introduces random variations into his marking.¹

(d) That an examiner who introduces a large random element
into his marking is not as precise an examiner as one who
introduces a small random element into his marking.

(e) That a first approximation to the ideal mark may be
obtained by taking the simple average mark of a number of
examiners; and that a closer approximation may be obtained,
if we take account of the fact that some examiners are more
precise than others, and if we therefore use a "weighted"
average, the "weight" of an examiner being inversely
proportional to the variance of his random variations.
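
Assumption (e) can be sketched as follows. The marks and variances are invented for illustration, and the function name is ours; the sketch only shows how weights inversely proportional to the variance pull the estimate towards the more precise examiners.

```python
def weighted_ideal_mark(marks, variances):
    """Average of the examiners' marks, each weighted by 1 / variance."""
    weights = [1.0 / v for v in variances]
    return sum(w * m for w, m in zip(weights, marks)) / sum(weights)

# Hypothetical marks given to one script by three examiners, with the
# variance of each examiner's random variations (also hypothetical).
marks = [48, 52, 60]
variances = [4.0, 4.0, 36.0]  # the third examiner is much less precise

simple_average = sum(marks) / len(marks)
weighted_average = weighted_ideal_mark(marks, variances)

print(round(simple_average, 2), round(weighted_average, 2))
```

Here the simple average is about 53.3, while the weighted average is about 50.5, because the imprecise third examiner counts for only one-ninth as much as either of the others.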

These assumptions make it possible for us to split up any
group of marks awarded by examiners to a number of scripts into




(hnc» a Uw (MMibibir Uut tlu UidiT'idoal exsmiaer msp 
j i hsa sw m mmtmt hi (ha “apniidtiig” of tha kUsl msrki. 

1n0avssaiMIMts«a4'feal*Mk«tak*»da»nipw(d«ith 4ei(ii£tad tnatnuithnia 
MteeSMliia WMSt hw lwsti w i s (daUng to tlw ataodanla of Pam, VA Oradlt. 
Ii<yawa«e».sii4 ma aaa sil aiii nJ to tha Mad of wauntetog work whloh (toy hSTa 
«n ftoMsisik il aMS h» sajpuKi AM (to* a aat sui^ IthMlihaod ot dlfl^now dT 
Aha *e(H# la<5f liajillBiiiJ into tlat mnills of (bamacklng. Ba(, laantartotaatiUia 
itmm af IM* IM fat i laa wn afaia y, thtw aala <t oar data w«r« attbndt^ to a saw 
ae rt r a a ttoiiaost lt a i wa wD S lito . I»watfiwndthattltoj(Bata»{wB 0 h^ 
teiia i w hlw is t sgtha toi isi aa i toto i wt toto 
(i) The ideal mark; (ii) the amount by which the examiner's
standard differs from the ideal; (iii) the random element
appropriate to that particular script.

For an outline of the method by which the ideal marks have
been calculated, see para. 133 below.
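
The method actually used is the one outlined in para. 133, which is not reproduced here. The sketch below is only the "first approximation" of assumption (e), with the simple average standing in for the ideal mark, applied to invented marks. The names `split_marks`, `standard` and `random_part` are ours.

```python
from statistics import mean

# Hypothetical marks: three examiners, four scripts.
marks = {
    "A": [52, 61, 45, 70],
    "B": [48, 55, 43, 62],
    "C": [50, 58, 50, 66],
}

def split_marks(marks):
    """Split each mark into ideal mark + difference of standard + random element."""
    examiners = list(marks)
    n_scripts = len(marks[examiners[0]])
    # (i) first approximation to the ideal mark: simple average over examiners
    ideal = [mean(marks[e][j] for e in examiners) for j in range(n_scripts)]
    # (ii) amount by which each examiner's standard differs from the ideal
    standard = {
        e: mean(marks[e][j] - ideal[j] for j in range(n_scripts))
        for e in examiners
    }
    # (iii) what remains is the random element for that examiner and script
    random_part = {
        e: [marks[e][j] - ideal[j] - standard[e] for j in range(n_scripts)]
        for e in examiners
    }
    return ideal, standard, random_part

ideal, standard, random_part = split_marks(marks)
print(ideal)
print(standard)
print(random_part)
```

By construction the three components add back to the original mark, and each examiner's random elements average to zero over the scripts.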

106. The size of the random element is estimated by means of
the standard deviation* of the group of random variations present
in the marks allotted by an examiner, and this measure can
therefore be used to compare one examiner with another as to
precision of marking, an examiner with a large standard deviation
being considered as less precise in his marking than one with a
smaller standard deviation. We can also compare one paper in
a subject with another paper, or one subject with another from
the point of view of precision of marking by observing the
differences between these standard deviations.

School Certificate History

107. We may illustrate the results of our procedure by quoting
the appropriate components into which the marks are split up
in the cases of Examiners B and H in the first investigation on
School Certificate History.
* The standard deviation of a series of numbers is the square root of the
average of the squares of the differences of the numbers from their average.
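
The definition in the footnote translates directly into a few lines of code. The sample figures are invented; the comparison with `statistics.pstdev` from the Python standard library is included only as a cross-check.

```python
import math
from statistics import pstdev

def standard_deviation(numbers):
    """Square root of the average squared difference from the average."""
    average = sum(numbers) / len(numbers)
    mean_square = sum((x - average) ** 2 for x in numbers) / len(numbers)
    return math.sqrt(mean_square)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative figures only
print(standard_deviation(sample), pstdev(sample))
```

Note that the divisor is the number of values itself, as in the footnote, and not one less than that number.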
The random element is larger with Examiner B than with
Examiner H, and this is reflected in the higher standard deviation
of his random variations.

108. The standard deviations indicating the extent of the
random element in marking at the two investigations on School
Certificate History are given below.
The two sets of figures in the above Table of standard deviations
are of the same size, and the columns showing the order
of the examiners according to this criterion are very similar. At
the second investigation, those examiners with the smaller random
variations at the first marking allot marks which again have the
smaller random variations on the whole, the correlation
between the two orders above being 0.66. As far as can be
judged from this investigation, the examiners show some
consistency in the extent of their random variations on two
different occasions.
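
The report does not state how the correlation between the two orders was computed. One common choice, assumed here, is the rank correlation formula 1 - 6Σd²/(n(n² - 1)), which holds when there are no tied places. The two orders below are invented for illustration.

```python
def rank_correlation(order_1, order_2):
    """Rank correlation between two orders of merit without ties."""
    n = len(order_1)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(order_1, order_2))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

# Hypothetical places of six examiners on two occasions, ranked by the
# size of their random variations (1 = smallest).
first_occasion = [1, 2, 3, 4, 5, 6]
second_occasion = [2, 1, 4, 3, 6, 5]

print(round(rank_correlation(first_occasion, second_occasion), 3))
```

A value of 1 would mean that the examiners kept exactly the same places on both occasions; a value near 0 would mean that precision on one occasion told us nothing about the other.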

109. The standard deviation may be considered to indicate that
if a candidate's ideal mark is, say, 50, an examiner with a random
variation indicated by a standard deviation of (say) 2¼, would
award a mark probably within a range of 4½ (twice the standard
deviation) on either side of 50, i.e., his mark would probably be
somewhere between 45½ and 54½. Thus on one occasion he may
give 51 marks to such a script, on another 48 marks, on another
53 marks. Some of these standard deviations are quite high
(over 7 marks) indicating that an examiner with such a loose
standard of marking may award, instead of 50 marks, a mark
somewhere in the range 35 to 65. Now in this kind of examination
this range of marks would include the borderline marks for Pass and
for Credit. Thus a candidate who is possibly worthy of a Credit
may actually achieve only a Pass or even be dubbed a Failure,
or he may succeed in being given a mark of Credit instead of a Pass.
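
The rule of thumb used above, twice the standard deviation on either side of the ideal mark, can be written as a one-line calculation. The function name is ours, and the figure of 7.5 for the looser examiner is an assumption chosen to reproduce the range quoted for a standard deviation "over 7 marks".

```python
def probable_range(ideal_mark, standard_deviation, multiple=2):
    """Range of marks within `multiple` standard deviations of the ideal."""
    spread = multiple * standard_deviation
    return ideal_mark - spread, ideal_mark + spread

print(probable_range(50, 2.25))  # a fairly precise examiner
print(probable_range(50, 7.5))   # assumed value for a "loose" examiner
```

The first call gives the interval from 45½ to 54½, and the second gives 35 to 65, wide enough to straddle both borderlines in an examination of this kind.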

110. The extent of the variability amongst the candidates, due
to their differences in ability to answer questions in this subject,
as judged from the ideal marks, was 6.0 in the first investigation,
and 6.6 in the second. The standard deviations of the random
variations are in the case of many examiners of this order of size,
and it is quite conceivable that the difference in the standards
of marking of the examiners combined with the random variations
which, in view of the sizes of the standard deviations, are likely
to occur, would result in all these candidates being awarded
exactly the same mark on some occasion. Actually this is what
happened when the scripts were first marked for the Examining
Body. As stated in para. 5 above, the scripts all received the
same "middling" mark.


School Certificate Latin

111. The Table below shows the standard deviations of the
random variations of the two groups of examiners in Latin.

112. This investigation gives material from which a comparison
is possible of the precision of marking for the two parts of the
Paper. Some examiners appear to be relatively more precise
when marking Paper I (grammar, etc.) than Paper II (prescribed
books), but with others the contrary is the case, and it is doubtful
if the evidence warrants the drawing of a general conclusion either
way. The standard deviations are shown below.

PdPSR t. (to timkt.] I PAPER IL (Maximum 60 matrit.) 
Of the examiners of Group 1, A, B, E and E mark Paper I with
more precision than Paper II; of the examiners of Group 2, G,
H, K and N mark Paper I with more precision than Paper II;
in one case four out of the six examiners, in the other four out of
the seven mark Paper I with more precision than Paper II.

School Certificate French

113. The standard deviations of the random element in the
individual examiners' final marks for the whole subject are shown
below, together with the standard deviations of the two sets of
ideal marks.











The extent of the random element is small compared with the
amount of natural variation amongst the candidates, in the case
of both sets of examiners.

114. It is interesting to note the effect of the random element,
by comparing Examiner C's marks with the ideal marks of
Board I, the difference between C's average and the ideal average
being negligible, and by comparing Examiner J's marks with the
ideal marks of Board II, the difference between J's marks and
the ideal average of Board II again being negligible.

These two sets of marks are given below, together with the
classified results:
The candidates whose class is affected by the random
element in Examiner C's marking are Nos. 4, 14, 15, 16, 21, 27,
811, 41, eight in all. The details are as follows:

Thus a small difference of 1, 2 or 3 marks has the effect of
making a difference in class in 5 cases.

Similarly the candidates whose class is affected by the random
element in Examiner J's marking are Nos. 15, 18, 19, 25, 30, 36,
39, seven in all. The details are as follows:


QttmdiSatt 

DiffittMt 
lietwen J '0 nark 
ami Ideal 

Idetd OUut 

j:k cifwc 

1 . 1 
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j ltai»ed + 
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j + 
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, U 1 
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96 
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i D j 

+ 

30 

-6 

c 

P ] 

— 

36 

+2 

p 

C i 
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39 

-6 

p 

P 

1 



Again a small difference of 1, 2 or 3 marks has the effect of
making a difference in class in 4 cases.

These two illustrations are typical of the effect of the random
element on the class results. In each case the random element
is fairly small (a standard deviation of about 2½ marks out of
100). In one case 8 candidates, and in the other case 7 candidates
out of 50, have their class altered owing to the presence of the
random element in the examiner's marks.
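
How a difference of a mark or two can change a candidate's class may be seen from a small sketch. The borderline marks and both sets of marks below are hypothetical, since the tables for Examiners C and J are not reproduced here, and the function names are ours.

```python
# Hypothetical minimum marks for each class, highest first.
BORDERLINES = [(50, "Credit"), (33, "Pass")]

def classify(mark):
    """Return the class for a mark, checking the highest borderline first."""
    for minimum, label in BORDERLINES:
        if mark >= minimum:
            return label
    return "Fail"

def changed_class(ideal_marks, examiner_marks):
    """Indices of candidates whose class differs between the two sets of marks."""
    return [
        i
        for i, (ideal, given) in enumerate(zip(ideal_marks, examiner_marks))
        if classify(ideal) != classify(given)
    ]

ideal_marks = [49, 51, 60, 32, 34, 45]     # illustrative figures only
examiner_marks = [51, 49, 58, 34, 33, 46]  # each within 2 marks of the ideal

print(changed_class(ideal_marks, examiner_marks))
```

Only candidates whose ideal marks lie close to a borderline are affected; the candidate at 60 keeps the same class although the examiner's mark differs from the ideal by as much as any other.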

116. An examination of the standard deviations of the
examiners' random variations obtained when individual questions in
the papers are the subject of consideration reveals the fact that
some questions lead to more precise marking on the part of the
examiners than others. For instance, answers to Qn. 1 of Paper I
receive more precise marking than answers to Qn. 2 of
Paper II.


School Certificate Chemistry

117. The standard deviations of the random marks are shown
below:

STANDARD DEVIATIONS OF RANDOM VARIATIONS


BOARD I.

BOARD II.

ilxiiminw A • - < 

2-0 

Examiner Q - - 

6-6 

« B ■ - . 

4-0 

„ H . - 

3<I 

H C • - • 

2 -e 

„ J . . 

4*7 

„ B * • 

4*2 

.. K . - 

3-6 

« R • ■ 

4-0 

„ li . - 

2-7 

„ F • ■ 

3-6 

M . - 

2-8 

tMukdMd BevikOon of 


Standard Deviation of 


leoifcl Uwkx • 

18 -e 

Ideal Itlarke 



ie<8 


The random element is not very pronounced, ranging from
about 2½ to 6½ marks in 100. It is higher than in the
corresponding French examination, where the random marks had
standard deviations ranging from 1.8 to 3.8. We may note that
G's random marks on the average are about twice as large as
those of L or M.

118. One of the chief reasons why the members of the two
Boards placed different numbers of candidates in the various
grades, Distinction, Credit, Pass, Fail, is that the two Boards on
this occasion adopted different borderline marks for these
grades.

School Certificate English

119. The random variations introduced into the marking are
indicated below (with a maximum mark = 100):




D 

1 ^ 

1 ° 

D 

R 

1 

F 

G 

OMatiaa • 



3-27 

3-84 

3-13 

8-00 

1 

4-87 


120. The marks awarded by the examiners to the seven
questions in this examination which were answered by the
majority of the candidates were submitted to the same method
of analysis with the results given below, where for comparative
purposes each figure has been reduced to a percentage of the
maximum mark per question.
STANDARD DEVIATIONS
121. We may make several observations on this table. In the
first place, of the standard deviations of the ideal marks expressed
as percentages of the maximum marks the least is that for the
essay question.

Secondly, comparing the average of examiners' variations with
the standard deviations of the ideal marks, we note that the
former are greater than the latter in the case of the Essay and
Précis, and are less than the latter in the case of the other
questions.

Paper I deals with Essay and Précis; and the marking of this
Paper is less precise than the marking of Paper II, which deals
mainly with set books.

The total variation of the candidates' marking may be regarded
as due to a combination of their natural variation with the
variation of the examiners' marks. Where the variation of the
examiners is comparatively large, as in the marking of Paper I,
the total variation is mainly due to the variation of the examiner.
Where it is smaller, as in the case of Paper II, the total variation
is mainly due to the natural variation.
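
The text speaks only of a "combination" of the two sources of variation. If, as an assumption of ours, the natural variation and the examiners' variation are taken to be independent, their variances add, and the share of the total due to the examiner can be worked out as below. The figures are invented for illustration.

```python
import math

def total_sd(natural_sd, examiner_sd):
    """Standard deviation of the total, assuming independent components."""
    return math.hypot(natural_sd, examiner_sd)

def examiner_share(natural_sd, examiner_sd):
    """Fraction of the total variance contributed by the examiner."""
    return examiner_sd ** 2 / (natural_sd ** 2 + examiner_sd ** 2)

# Hypothetical standard deviations, as percentages of the maximum mark.
print(total_sd(3.0, 4.0), examiner_share(3.0, 4.0))  # examiner dominates
print(total_sd(4.0, 3.0), examiner_share(4.0, 3.0))  # candidates dominate
```

With the larger standard deviation on the examiner's side, nearly two-thirds of the total variance comes from the marking and not from the candidates; with the figures reversed the proportions are reversed as well.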
Special Place Examination (11) : English Essay

122. Our method of analysis enables us to give a reasonably
clear answer to the question: "Is marking by details more
precise than marking by impression?" We saw that the detailed
marking gave on the whole higher average marks than marking
by impression, but the random element appears to be present to
the same degree in both types of marking.

123. The table below gives the standard deviations of the
random variations:








Marking 

Marking 



ExomiiUT 



by 

by 







ImpreMion 

OebxvLt 
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0-0 
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6-fl 

8-2 

I. 


• 




6-3 

7-2 
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7-3 

0-6 
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• 




7-7 

7-9 

r 
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m 
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7-0 

0-3 


• 

• 

- 

- 

1 8-4 

i 

7.0 


9l«e mcambMos (A, C. Oi H, P) have less of the random element 
in ttMir tuuldk^ by details than in that by impr^on, 
vidtfib £bi« ctSm five ate mote precis when markmg by impression 
flWHk then Bunting by details. On the average there seems no 
pma& fat itBumtiiltig that either method of marking is better than 
teathcB firenk tbe potot of view of precision. 


0tiO(i$$ MtKhwM SeEolarship : EagUsh Essay 

124. Whereas the differences between the examiners' average
marks show that the standards of marking of the examiners are
on the whole very little different from one another, the random
variations are, on the other hand, rather large.
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Standard deviations A, 8-8 ; B, 9*1 ; 0, 9«0 ; JD, 7'S ; E, 6«6. 

The large discrepancies between the different examiners in this
investigation are due more to the random element in the marking
than to any steady differences of standard.

125. The data of this investigation were further analysed with
the object of discovering what influence, if any, the subject of the
essay had on the resultant mark. There were four essay subjects,
and the analysis showed that there were considerable differences
between the marks awarded by different examiners to essays on
different subjects. Thus the average of Examiner A's marks for
the candidates who wrote on Subject No. 2 was 9 marks (out of
100) more than that of Examiner B, but A's average for Subject
No. 4 was less than B's average for that subject.

126. The fate of a candidate in this type of examination is
partly dependent on the particular examiner's reaction to the
subject of the essay.

University Mathematical Honours. 

127. The standard deviations of the random variations in the
marking are reduced when the examiners are grouped in pairs for
the revision of the marks. The table below shows the standard
deviations, based on a maximum of 100.

Examiner              A     B     C     D     E     F
Standard Deviation   4.2   3.8   4.0   3.5   4.1   4.3

Pair of Examiners     G     H     J
Standard Deviation   2.3   3.2   3.0
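
The size of the reduction can be compared with what simple averaging would give. The sketch below is not the authors' calculation: it assumes that the mark of a pair behaves like the mean of two marks whose random errors are independent, in which case the standard deviation falls by a factor of the square root of 2. The function name `sd_of_mean_of_two` is introduced here for illustration only.

```python
import math
import statistics

# Standard deviations of the random variations for the six single
# examiners A to F (maximum mark 100), as tabulated above.
single_sd = {"A": 4.2, "B": 3.8, "C": 4.0, "D": 3.5, "E": 4.1, "F": 4.3}

# Standard deviations reported for the three pairs G, H and J.
pair_sd = {"G": 2.3, "H": 3.2, "J": 3.0}


def sd_of_mean_of_two(sd_first, sd_second):
    """SD of the average of two marks with independent random errors."""
    return math.sqrt(sd_first ** 2 + sd_second ** 2) / 2


typical_single = statistics.mean(single_sd.values())
predicted_pair = sd_of_mean_of_two(typical_single, typical_single)
observed_pair = statistics.mean(pair_sd.values())

print(f"typical single examiner: {typical_single:.2f}")
print(f"predicted for a pair:    {predicted_pair:.2f}")
print(f"observed for the pairs:  {observed_pair:.2f}")
```

Under this assumption the predicted figure falls inside the range of the three values reported for the pairs, which suggests that the reduction is about what independent errors would produce.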

128. The differences between the different examiners' standards
of marking are not very great, and these were reduced when the
revision took place; but the differences of standard still remaining,
coupled with the random element, would still have the
effect that in certain cases the class awarded to a candidate
would depend on the pair of examiners by whom he was
marked.

University History Honours

129. The method which was used to estimate the size of the
random element in the marking in the previous investigations is
no longer possible of application in the present case as the marks
are given in literal form. But by a modification of the method
used we can get the relationship between the standard deviation
of the random variations and the ideal marks for each examiner.

130. We find comparatively large random variations present in
the marks allotted by the examiners in this investigation, the
standard deviations being in many cases greater than the standard
deviation of the ideal marks. As there was a general consensus
of opinion amongst the examiners that the candidates were on
the whole below the first class, we can assume that the standard
deviation of the ideal marks of each paper would be 10 out of 100
on a numerical basis. On this assumption, the corresponding
average standard deviations of the random variations for the
four papers of the examination would be Paper I, 12; Paper
II, 17; Paper III, 10; Paper IV, 9 marks (out of 100). Not
too much precision should be accorded to these figures; they are
mainly estimated with the idea of comparing the results of this
investigation with the others where the marks were given
numerically and not literally.

Summary of the foregoing Sections.

131. The following table gives average figures for standard
deviation of the random variations, two figures being given
where two Boards or two Groups of Examiners acted separately.
In each case the marks are referred to a maximum of 100.

132. As might have been expected, the most precise results are
obtained in those examinations where many detailed instructions
are given, and where the marking is therefore standardised as
much as possible, and the least precision is obtained in the
examinations of the essay type, where far more is left to the
judgment of the examiner.

Method of Calculating Ideal Marks.

133. Let us call the marks awarded to the pieces of work written
by $n$ candidates by the several examiners $X_i, Y_i, Z_i, \ldots$, where
$i$ takes all values from 1 to $n$. We assume that the "ideal"
mark appropriate to the piece of work of the $i$th candidate is
$Q_i$, and that

$$X_i = Q_i + A_i, \qquad Y_i = Q_i + B_i, \qquad Z_i = Q_i + C_i,$$

and so on,* $A, B, C, \ldots$ being used to indicate the differences
between the ideal marks and those awarded by the various examiners
A, B, C, ... .

The averages of these various marks for the group of $n$ candidates
are indicated by $\bar{X}, \bar{Y}, \bar{Z}, \ldots, \bar{Q}, \bar{A}, \bar{B}, \bar{C}, \ldots$ .

Deviations from the averages are indicated by small letters
$x_i, y_i, z_i, \ldots, q_i, a_i, b_i, c_i, \ldots$ .

Then we have $x_i = q_i + a_i$, $y_i = q_i + b_i$, and similarly.

Consider the pair $x_i = q_i + a_i$, $y_i = q_i + b_i$. We have

$$x_i - y_i = a_i - b_i,$$

and

$$(x_i - y_i)^2 = a_i^2 + b_i^2 - 2 a_i b_i.$$

Summing such identities for $i = 1$ to $n$ gives

$$S(a_i^2) + S(b_i^2) = S(x_i - y_i)^2,$$

assuming that $S(a_i b_i) = 0$, an assumption which depends on the
random element in A's marking being independent of the random
element in B's marking.

* The further refinement referred to in the footnote to an earlier
paragraph [number illegible] would correspond to a modification of
this assumption. We should then assume

$$X_i = r_1 Q_i + A_i, \qquad Y_i = r_2 Q_i + B_i, \qquad Z_i = r_3 Q_i + C_i,$$

and so on, the $r$'s being multipliers differing from one examiner to
another. The statistical analysis is naturally modified in consequence.
This subject is discussed in more detail in two memoranda by Professor
Cyril Burt and by Dr. Rhodes in The Marks of Examiners.

Similarly we can obtain the equation

$$S(a_i^2) + S(c_i^2) = S(x_i - z_i)^2,$$

on a similar assumption.

If there are $m$ examiners, there are $\frac{m(m-1)}{2}$ such equations.

From these equations we can estimate the most probable values
of each of $S(a_i^2), S(b_i^2), \ldots$, because in each of these equations
the right hand side is known from the data.

We obtain our results in this form:

$$S(a_i^2) = \frac{1}{m-1}\left[S(x_i - y_i)^2 + S(x_i - z_i)^2 + \ldots\right] - \frac{1}{(m-1)(m-2)}\left[S(y_i - z_i)^2 + \ldots\right],$$

where the first bracket contains every pair that includes examiner A
and the second every pair that does not, and so on.

These estimates of $S(a_i^2), S(b_i^2), \ldots$, being proportional to the
variances of the random element introduced by A, B, ..., into
the marking, give us weights $w_a, w_b, \ldots$, from which the ideal
marks may be estimated. Thus [the formula that followed is not legible in this copy].
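
The estimation procedure described in this section can be sketched in code. This is an illustration under stated assumptions, not the authors' own working: "most probable values" is read as the least-squares solution of the pairwise equations, and the weights are taken as the reciprocals of the estimated error sums (the source does not show the final formula). The function names and the marks are invented for the example; the marks are constructed so that the examiners' errors are exactly uncorrelated.

```python
from itertools import combinations


def deviations(marks):
    """Deviations of one examiner's marks from that examiner's average."""
    mean = sum(marks) / len(marks)
    return [mark - mean for mark in marks]


def pairwise_sums(marks_by_examiner):
    """S(x - y)^2 for every pair of examiners, computed on deviations."""
    dev = {name: deviations(marks) for name, marks in marks_by_examiner.items()}
    return {
        frozenset(pair): sum((p - q) ** 2 for p, q in zip(dev[pair[0]], dev[pair[1]]))
        for pair in combinations(dev, 2)
    }


def error_sums(pair_sums, examiners):
    """Least-squares estimates of S(a^2), S(b^2), ... (needs m >= 3)."""
    m = len(examiners)
    if m < 3:
        raise ValueError("at least three examiners are needed")
    estimates = {}
    for name in examiners:
        with_name = sum(v for pair, v in pair_sums.items() if name in pair)
        without_name = sum(v for pair, v in pair_sums.items() if name not in pair)
        estimates[name] = with_name / (m - 1) - without_name / ((m - 1) * (m - 2))
    return estimates


def ideal_marks(marks_by_examiner):
    """Weighted mean of the examiners' marks for each candidate.

    Assumption: weights are the reciprocals of the estimated error sums.
    """
    examiners = list(marks_by_examiner)
    sums = error_sums(pairwise_sums(marks_by_examiner), examiners)
    if min(sums.values()) <= 0:
        raise ValueError("an estimated error sum is not positive")
    weights = {name: 1.0 / sums[name] for name in examiners}
    total = sum(weights.values())
    n = len(marks_by_examiner[examiners[0]])
    return [
        sum(weights[name] * marks_by_examiner[name][i] for name in examiners) / total
        for i in range(n)
    ]


# Made-up marks for four candidates by three examiners X, Y and Z.
example = {
    "X": [41, 49, 61, 69],
    "Y": [42, 52, 58, 68],
    "Z": [43, 47, 57, 73],
}
estimates = error_sums(pairwise_sums(example), list(example))
print(estimates)
print(ideal_marks(example))
```

With real marks the cross-products are only approximately zero, so an estimated error sum can come out small or even negative; the sketch raises an error in that case rather than producing a meaningless weight.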



APPENDIX I


UNIVERSITY HISTORY HONOURS (DETAILS OF
INVESTIGATION)

1. Character of Examination Papers. The examination papers
were four in number, all forming part of a University History
Honours Examination. The subjects of the papers were as
follows:

Paper I. Ancient and Mediaeval History.

Paper II. Mediaeval and Modern History.

Paper III. An Essay-paper with a choice from a number of
subjects.

Paper IV. Political Thought (Prescribed Books).

In Papers I, II, and IV, candidates were requested not to attempt
more than four questions out of a considerable number. The
time allowed for each paper was three hours.

2. Procedure. The University concerned furnished us with all
the scripts available in the subjects enumerated above from
a recent Honours examination.* Unfortunately 3 scripts (which
happened to be among the best) had been accidentally destroyed.
The total number of scripts available was 18 for Paper I, 17 for
Paper II, 18 for Paper III, and 18 for Paper IV.

* The examination included a number of other papers, but it was thought
that the field covered by these was sufficient for the purposes of the
investigation.





The following 17 examiners took part in the marking of the
scripts:

Professor J. B. Black, M.A., Burnett-Fletcher Professor
of History in the University of Aberdeen.

Professor A. Browning, M.A., D.Litt., Professor of History
in the University of Glasgow.

Mr. Noel Denholm-Young, M.A., Fellow of Magdalen
College, Oxford.

Professor A. H. Dodd, M.A., Professor of History in the
University of Wales.

Mr. D. L. Keir, M.A., Fellow of University College and
University Lecturer in English Constitutional History,
Oxford.

Mr. R. B. McCallum, M.A., Fellow and Lecturer in Modern
History, Pembroke College, Oxford.

Professor J. L. Morison, M.A., D.Litt., Professor of
Modern History, Armstrong College, University of
Durham.

Professor R. B. Mowat, M.A., Professor of History in the
University of Bristol.

Mr. J. N. L. Myres, M.A., Student and Tutor of Christ
Church, Oxford.

Mr. E. J. Passant, M.A., Fellow of Sidney Sussex College,
Cambridge.

Miss I. G. Powell, M.A., Lecturer in History at the Royal
Holloway College, University of London.

Professor Eileen Power, M.A., D.Lit., Professor of
Economic History in the University of London.

Professor F. M. Powicke, F.B.A., Regius Professor
of Modern History in the University of Oxford.

Mr. G. H. Stevenson, M.A., Fellow of University College
[...] Ancient History, Oxford.

Mr. C. G. Stone, M.A., Balliol College, Oxford.

Professor A. F. Basil Williams, O.B.E., M.A., F.B.A.,
Professor of History in the University of Edinburgh.

Professor C. H. Williams, M.A., Professor of History in
the University of London.

The examiners are designated A, B, C, . . . R, in what follows,
but this designation does not correspond with the alphabetical
order of the names.

3. The scripts of Paper I were marked by 5 examiners; the
scripts of each of the other papers by 10 examiners. The only
reason for having the scripts of Paper I marked by fewer examiners
was the difficulty in getting examiners to cover the two periods
with which it dealt.

As in other investigations, no indication of origin or of the
original marking appeared on the scripts, or was communicated to
the examiners.

Each examiner marked each individual question separately and
gave a final mark for each script as a whole.

4. The following "literal" system of marking, including
24 grades ranging from δ to α+, was, after consultation with an
eminent historian, submitted to and approved by the great
majority of examiners before the work began. It was communicated
as approved to one or two examiners who came into the
investigation subsequently.


TABLE 1

5. It may be well to say a word here on the use of a literal
system of this kind as compared with the numerical systems
employed in our other investigations. The literal system is
generally used at Oxford; there is a considerable variety of usage
in other Universities.

6. There seems to be a fundamental difference, at any rate at
the first blush, between the two systems. The literal system
indicates only an order in classification, not ratios of proficiency.
With that system, there can be no question of adding up marks
for individual questions in order to obtain a percentage of a total
maximum. It would appear that the literal mark indicates in
the examiner's mind a certain "quality." The question of
"quantity" probably enters into his estimate only in a subordinate
degree.

With the numerical system, on the other hand, the marks for
individual questions are added up to furnish a total, a procedure
which is convenient, though it is based on hypotheses which it is
not perhaps easy to analyse and justify. But any attempt to add
together the symbols indicating "classes" or "grades" would
seem a priori unjustifiable and would be rejected by many who
use literal marks.

7. Both systems have their conveniences. It is for the sake of
readers who are unaccustomed to literal marking, and to enable
them to estimate by what number of grades (or subordinate
classes) any two examiners differ, that we have attributed the
numbers 1 to 24 to the successive grades, δ to α+, and that, side
by side with the literal tables, we have inserted numerical
tables on this basis. But, for the reasons stated above, the
numbers indicating grades must not be regarded as numerical
marks. They are ordinal numbers, not cardinal.

8. Readers accustomed to numerical marking may further
wish to have some means of comparison between the two systems.
A rough and ready form of translation from one into the other
would be to suppose that each of the 24 literal symbols corresponds
to a step of four marks, and the highest, α+, to 96.

No experimental investigation could afford any real basis
for such a translation. But it is certain that such a difference
as that of 18 grades, the maximum difference between the awards
of two different examiners to the same script in this investigation,
corresponds much more nearly to a difference of 72 in numerical
marking, with 96 (or 100) as a maximum mark, than to a difference
of 18, which a superficial glance might suggest.
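
The rough translation described in this paragraph amounts to a multiplication by four. The sketch below only restates that arithmetic; the function names are introduced here for illustration, and the conversion carries the same caveat as the text, namely that grade numbers are ordinal and the numerical equivalent is a convenience rather than a measurement.

```python
MARKS_PER_GRADE = 4    # each literal grade taken as a step of four marks
NUMBER_OF_GRADES = 24  # grades are numbered 1 (lowest) to 24 (highest)


def grade_to_marks(grade):
    """Rough numerical equivalent of a grade number (1 to 24)."""
    if not 1 <= grade <= NUMBER_OF_GRADES:
        raise ValueError("grade must lie between 1 and 24")
    return grade * MARKS_PER_GRADE


def gap_in_marks(first_grade, second_grade):
    """Difference between two grades, expressed in marks out of 96."""
    return abs(grade_to_marks(first_grade) - grade_to_marks(second_grade))


print(grade_to_marks(NUMBER_OF_GRADES))  # the top grade
print(gap_in_marks(22, 4))               # two awards 18 grades apart
```

The two printed values are 96 and 72, the maximum mark and the numerical equivalent of an 18-grade difference quoted in the text.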

9. An index of the examiners who marked the various papers
is given in the Table below:

TABLE 2

[Index of examiners: rows for Examiners A to R, columns for
Papers I to IV, with an asterisk against each paper marked by the
examiner. The individual entries are not legible in this copy.]

The papers marked by each examiner are indicated by an asterisk
in the row corresponding to the letter by which he is designated.
Thus Examiner B marked Papers II, III and IV.

10. In Tables 3, 4, 4a, 5, 5a, 6 and 6a are set out the literal
marks assigned by the examiners to the scripts of each candidate,
and the numerical representation of the corresponding grades
according to the convention explained in paras. 7 and 8 above.

[Tables 3 to 6a: literal marks, and the corresponding grade
numbers, awarded by each examiner to each script in Papers I to
IV. The individual entries are not legible in this copy.]

11. A glance at the Tables shows certain general features of
interest. We have a closeness of marking between certain examiners
and a wide difference between others, not attributable to chance,
but showing real and probably irreconcilable differences of standard.

12. The examiners were asked to indicate what were their
limits for a First, a Second, and a Third Class. Not all replied
on the point. In the original scheme, a copy of which was furnished
to each examiner (see para. 4), there was a gap between βα
and β++ and between β= and βγ, it being tacitly implied that
there were three classes. The following is a summary of information
supplied by the examiners on the meaning of the symbols.

A: αβ and βα borderline. So also βγ and γβ. δ fails.

B: Nil.

C: αβ is a first, βα a second, [...] is a third class. δ is a failure.
Rarely uses high α's or low marks, e.g. γ's.

D: Does not use α+ or α?+; perfection is α. βα or β++ is the best
second class. He would have put βα at the top of the second
group.

E: αβ and βα are borderline marks, the former indicating a first class
paper with either one poor answer or one persistent fault,
the other a second class paper with one excellent answer or
one very sound quality. Similarly with other borderline
marks. Failures are γ− and δ.

F: βα is top of second class, β?− is top of third. δ is failure.

G: βα is top of second, β= is top of third class. αβ and βα are
borderline and β−?− is borderline. γ− and δ are failures.

H: αβ and βα as in E. δ is failure.

J: First, second and third class as implied in the scheme sent out.

K: Nil.

L: αβ minimum for first class. βα borderline. βγ minimum for
second. γβ borderline. δ failure.

M: αβ minimum for first class. βα borderline. βγ and γβ borderline.
δ failure.

N: αβ minimum for first class. β= second; third, βγ to last,
including δ.

O: As in E with qualification "that value of borderline marks as
means of judging is that, if several papers have to be combined
in the final result, the mixed or 'border' marks have an
additional significance, pointing to the need for [...].
They suggest quality. Hence I should personally avoid them
if only one paper was set on a subject."

P: αβ and βα borderline as E. So with the βγ, γβ.

Q: αβ minimum for first class, βα highest second. So with others.

R: αβ and βα borderline. β−, β−?−, β= borderline. βγ highest
third class. γ− [...]

13. The examiners are not in sufficient agreement on this
point to use their remarks as a basis for classification. In actual
practice it is well known that the limits are not determined in any
purely mechanical way, but are the subjects of discussion in
connexion with all borderline cases. The subject of the present
investigation is not the actual award of First, Second and Third
Classes at a History Honours examination, but the variation in
the individual judgments which must serve as a basis for those
awards.

Although we cannot use the terms First, Second and Third
Class, we can distinguish between the number of α's, β's, γ's, and
δ's and of borderlines.

Thus the lowest limit for a First Class most generally adopted
is αβ; but some are willing to consider βα, the next grade, as a
borderline for a First.

There is much more variation in the opinions as to the lower
limit of a Second Class:

β is adopted by F,
β=, by C, H, J, and N,
βγ, by Q.

Some of the other examiners indicate that the borderline marks
between second and third class are as follows:

β=, Examiner R.
β−?−, Examiner G.
βγ and γβ, Examiners A, E, H, P.
γβ, Examiner L.

We have thus a difference of several grades between the highest
and the lowest limit adopted by the different examiners.
In the Tables below we treat as α's the grades from α+ to α=,
as β's the grades from β++ to β=; as γ's the grades from γ+
to γ−; αβ and βα are treated as borderline cases between α and β;
and βγ and γβ as borderline cases between β and γ.

14. We give in Tables 7 to 10 below the classification statistics
of the various examiners on the foregoing basis, for the scripts
marked.

TABLE 7

PAPER I (Ancient and Mediaeval History)

[Number of awards of each mark (α, borderline, β, borderline, γ, δ)
by Examiners D, K, O, P and Q; each examiner marked 18 scripts.
The individual entries are not legible in this copy.]

Median mark (grade number): D, β to β?+ (11-12); K, β+ (13);
O, β (11); P, β− (9); Q, γβ (5).

Thus Examiner D gives two candidates clear α's, 4 candidates a
borderline mark between α and β, 10 candidates β, 1 candidate
a borderline mark between β and γ, and 1 candidate γ. Q returns
them all as β or worse, and no examiner uses δ.

15. TABLE 8

PAPER II (Mediaeval and Modern History)

[Number of awards of each mark by Examiners A, B, C, F, H, J, K, L,
N and Q, with their median marks; each examiner marked 17 scripts.
The individual entries are not legible in this copy.]

J and L mark the scripts as β or worse, C as β or better.
A and N are the only ones to use δ.

16. TABLE 9

PAPER III (Essay)

[Number of awards of each mark by Examiners A, B, F, H, J, K, L, N,
Q and R, with their median marks; each examiner marked 18 scripts.
The individual entries are not legible in this copy.]

K marks all the candidates as β or worse, and F returns them
as β or better. A and N again are the only examiners to use δ.


17. TABLE 10

PAPER IV (Political Theory)

[Number of awards of each mark by the ten examiners who marked
this paper, with their median marks; each examiner marked 18
scripts. The individual entries are not legible in this copy.]

L marks the scripts as β or worse, while two other examiners mark
them as β or better. A is the only examiner to use δ.

18. We have the best basis for judging the differences between
individual examiners if we consider the results of those who have
marked three papers, i.e. A, B, F, H, J, K, L and Q; Examiners
A, B, F, H, J find clear α quality in some papers, whereas K, L
and Q never discover this quality.

Again, B, H, J, K and Q discover clear γ quality in some papers,
but A, F and L do not, though A discovers δ quality in three
papers. (A and N are the only examiners who award a δ.)

19. The averages (medians) of Q (γβ for Paper I and βγ for
Paper II and Paper III) differ fundamentally from the rest,
all of which are in the range of β's. Of these examiners, B and L
may be regarded as the extremes; their averages (medians)
are set out below:

                      PAPER
          II             III                IV
B     β+?+ (14)    β+, β?+ (12-13)     β?+ (12)
L     β?− (10)     β, β?+ (11-12)      β− (9)

Q differs definitely from all the other examiners; and we get
a fairer picture of the differences likely to occur in standard if
we show the range of averages (medians) of the other examiners
for the four papers set out below.

                            PAPER
                 I             II              III            IV
Highest      (K) β+ (13)   (B) β+?+ (14)   (B) β+ (13)    [illegible]
Lowest       (P) β− (9)    (J) β− (9)      (N) β?− (10)   [illegible]
Difference
(Number of
Grades)           4             5               3              4

20. There is thus between these averages (medians) about
four grades difference, from β+ to β−, corresponding to the
familiar difference between II (i) and II (ii) of the Honours
lists of some universities. We may say that there is between
the standards of these examiners about half a class difference,
even leaving Q out of account.

21. It is not surprising, if there are such differences between
the averages (medians), that we should find much greater
differences in the marking of individual scripts.

For Paper I, Table 3 shows that Candidate No. 13 was awarded
α by Examiner O and γβ by Examiner P, a range of 17
grades out of a possible range of 24. Q marks him βγ, but
both D and K mark him αβ.

For Paper II, Table 4 shows that Candidate No. 8 gets α−
from B and γ+ from J, a range of 16 grades, while Candidate
No. 14 gets α− from B and γβ from H, a range of 15
grades.

For Paper III, Table 5 shows that Candidate No. 9 gets α
from A, and γ+ from B, a range of 18 grades; while Candidate
No. 8 gets α from R and βγ from Q and N, a range of 16
grades.

For Paper IV, Table 6 shows that Candidate No. 8 gets α−
from B and γ+ from J, a range of 16 grades.

These ranges are not affected by Q's low marking. Moreover,
the average ranges (again leaving Q out of account) are as
follows:

For Paper I       7 grades
For Paper II     11 grades
For Paper III    10 grades
For Paper IV      9 grades

Thus on the average there is a whole class difference or thereabouts
between the marks awarded by different examiners to
the same script, since each class may be supposed to comprise
about eight grades.
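
The comparison between a range in grades and a difference of class can be set out as a small calculation. The sketch assumes eight grades to a class (24 grades spread over three classes); the list `hypothetical_awards` is invented for illustration and is not taken from the tables of marks.

```python
GRADES_PER_CLASS = 8  # a class is taken to comprise about eight grades

# Average ranges (in grades) for the four papers, as listed above.
average_range = {"I": 7, "II": 11, "III": 10, "IV": 9}


def script_range(grade_numbers):
    """Range of one script: highest minus lowest grade number awarded."""
    return max(grade_numbers) - min(grade_numbers)


def in_classes(grades):
    """Express a difference in grades as a fraction of a class."""
    return grades / GRADES_PER_CLASS


# Hypothetical grade numbers given to one script by five examiners.
hypothetical_awards = [22, 17, 17, 6, 5]
print(script_range(hypothetical_awards))

for paper, grades in average_range.items():
    print(paper, in_classes(grades))
```

On this footing the four average ranges lie between roughly seven-eighths of a class and one and three-eighths classes, which is the sense in which the text speaks of "a whole class difference or thereabouts".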

In no case does the same script get the same mark from all
the examiners. The nearest approach to equality is in judging
the obviously very poor performance of Candidate No. 11 in
Paper I; he gets γ from two examiners and γ− from the other
three.

The discrepancies between the marks awarded by the
examiners which have been the subject of discussion in the
preceding paragraphs may be considered to be due to two causes:
(1) constant differences of standard of marking on the part of
examiners; (2) the presence of an element of randomness in an
examiner's marking.

These points are discussed in Part II above (see p. 42 et seq.).



APPENDIX II.


BRIEF SUMMARY OF THE WORK OF THE FRENCH
INTERNATIONAL INSTITUTE EXAMINATIONS ENQUIRY
(Commission Française pour l'Enquête Carnegie sur les examens
et concours en France).


1. The French Committee*, who have received every assistance
from the French Ministry of Public Instruction, have published
a general report on French examinations, their character, the
spirit by which they are inspired, and their relationship to the
national system of education in the form of an Atlas de l'enseignement
en France (in-quarto-raisin, pp. xiii, 183, 13 planches hors
texte, à Paris, à la Maison du Livre, 4 Rue Félibien, 75 francs).

2. They also issued a questionnaire to some 4,000 persons with
regard to certain examinations, and will publish a summary of
the replies.

3. They have carried out a series of investigations on the
baccalauréat examination, in many ways similar to the investigations
described in the present pamphlet, and the results have
been recorded in a volume entitled La correction des épreuves
écrites dans les examens, enquête expérimentale sur le baccalauréat
(in-quarto-raisin, à Paris, à la Maison du Livre, 4 Rue Félibien).

4. The first examination investigated by the Committee was
the baccalauréat, because in their view this examination is both
the most typical and the most important of all the French
examinations. In the University of Paris alone there are about
[...] candidates annually for the two parts of the baccalauréat.
The examination serves both as a school-leaving examination for
the lycées (both for boys and girls) and as an entrance examination
for universities and to the liberal professions. "It is," says the
French Committee, "an instrument of selection of what may be
called the directing classes"† (l'instrument de sélection des classes
dirigeantes).

* The membership of the French Committee is given on page 7 above.

† It is clear from the context that the phrase "directing classes" is
used here to designate not those classes privileged by birth but those who
actually exercise a directing influence in the social system. The phrase
is used in the same sense in the Report of the Auxiliary Committee on
Education of the Indian Statutory Commission (1929).

5. The two parts of the French baccalauréat correspond, roughly
speaking, to the examinations for the School Certificate and for
the Higher School Certificate in England. The first part is
normally taken at the age of about 16 by pupils of the classe de
première (formerly called the classe de rhétorique). The second
part is normally taken a year later by pupils in two parallel
classes, the classe de philosophie and the classe de mathématiques.
In those classes philosophy is treated as the most important
subject on the literary side, mathematics as the most important
on the scientific side; but mathematics and other science subjects
are taught in the classe de philosophie, while philosophy and other
literary subjects are taught in the classe de mathématiques.

6. Both parts of the baccalauréat include a written examination
and a viva voce examination in a number of subjects. Only those
who pass on the written examination are admitted to the viva
voce. A total aggregate of 50 per cent on the subjects of the
written examination is required for a candidate to be admissible
to the viva voce examination; it would appear, without a minimum
requirement in any one subject.

7. The following summary is translated from the proofs of
Chapter VIII of the volume:

(1) Two investigations have been undertaken by the French
Committee (Commission Française Carnegie) on the marking of
scripts at the baccalauréat examination. The chief investigation
was undertaken with reference to the examinations in:

Part I of the baccalauréat: Translation from Latin (Version
latine); French Essay (Composition française); English;
Mathematics.

Part II of the baccalauréat for pupils of the classe de
philosophie: Philosophy.

Part II of the baccalauréat for pupils of the classe de
mathématiques: Physics.

100 scripts corresponding to each of these examinations,
which had been actually written at the examinations held in
July, 1930, were corrected and marked by 5 examiners
(correcteurs) chosen from the panel of examiners for the
baccalauréat (the actual mark of the examiner at the baccalauréat
examination furnishing a sixth mark).

The scripts chosen formed a sufficiently typical sample of
the baccalauréat scripts as a whole.

A supplementary investigation was made on three French
essays (copies de composition française), selected from those
used for the principal investigation, which were corrected and
marked by 76 different examiners.

(2) The maximum ranges* of the marks attributed to one and
the same script in the first investigation by the different
examiners were as follows:

12 marks out of 20 for Latin translation [60 per cent.]

13 marks out of 20 for French Essay [65 per cent.]

9 marks out of 20 for English [45 per cent.]

9 marks out of 20 for Mathematics [45 per cent.]

12 marks out of 20 for Philosophy [60 per cent.]

8 marks out of 20 for Physics [40 per cent.]

The mean differences between the marks of two examiners
varied from 1.88 out of 20 in Physics [i.e., 9.40 per cent.] to
3.36 out of 20 in Philosophy [i.e., 16.80 per cent.].† The
number of the differences between two examiners equal to
or higher than 5 marks out of 20 (25 per cent.) was 2.6 per cent.
in Physics and 23 per cent. in Philosophy.

(3) The number of scripts which were recorded as deserving an
average mark or a mark higher than the average in the opinion
of some of the examiners (but not of all) was as follows:

Latin translation ... 60 per cent. of the total number of the scripts

French Essay ... 70 per cent. of the total number of the scripts

English ... 47 per cent. of the total number of the scripts

Mathematics ... 36 per cent. of the total number of the scripts

Philosophy ... 81 per cent. of the total number of the scripts

Physics ... 60 per cent. of the total number of the scripts

* The term "range" is used, as in the text of this pamphlet, to denote
the difference between the highest and lowest marks allotted by different
examiners to the same script.

† The differences between each pair of examiners for each candidate
were calculated, there being, with six examiners, 15 differences in
each individual case, and 1,500 for each subject.
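
The counts in this footnote follow from taking the examiners two at a time: six marks per script give 15 pairs, and 100 scripts give 1,500 differences per subject. The sketch below checks that arithmetic and shows how the differences for one script would be formed; the helper names are introduced here for illustration and the marks passed to them would be hypothetical.

```python
from itertools import combinations
from math import comb

MARKS_PER_SCRIPT = 6       # five examiners plus the original mark
SCRIPTS_PER_SUBJECT = 100

differences_per_script = comb(MARKS_PER_SCRIPT, 2)
differences_per_subject = differences_per_script * SCRIPTS_PER_SUBJECT


def pairwise_differences(marks):
    """Absolute difference between every pair of marks for one script."""
    return [abs(a - b) for a, b in combinations(marks, 2)]


def mean_difference(marks):
    """Mean of the pairwise differences for one script."""
    diffs = pairwise_differences(marks)
    return sum(diffs) / len(diffs)


print(differences_per_script, differences_per_subject)
```

For a single script, `pairwise_differences` returns the 15 values that enter the mean difference quoted in the text; pooling them over all the scripts of a subject gives the 1,500 values of the footnote.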



8. For the second investigation on the French Essay three
scripts, Nos. 23, 25 and 34, were selected, each of which at the
original baccalauréat examination had been awarded 36 marks
out of 80 (or 45 per cent.) and had been ranked as 24th out of a
batch of 50. These three scripts were marked independently by
76 examiners. The marks for script No. 23 varied from 4 to 52,
for script No. 25 from 12 to 64, and for script No. 34 from 10 to
56 out of a maximum of 80. The mean marks for the three scripts
were as follows: Script No. 23, [figure illegible]; Script No. 25, 40.0;
Script No. 34, 34.4.

9. The book contains an elaborate statistical analysis of the
relations between the marks of the different examiners, from
which the following may be quoted:

After reduction by means of appropriate corrections of the
scales of the different examiners to the same level of severity
(by reducing to the same average) and to the same distribution
(by altering the marks so that they have the same standard
deviation), there still remain important differences between the
results of the pairs of examiners. The correlation between the
marks of two examiners was never perfect, with a value of
r = 1, and was as low as r = 0.112 (correlation between the
marks of Examiner C and Examiner D in Philosophy for 60
scripts of women candidates). The mean correlation coefficient
of all the examiners taken in pairs varies from r = 0.420 in
Philosophy (scripts of women candidates) to r = 0.888 in
Mathematics (scripts of male candidates).
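
The two corrections described in this passage, and the reason they cannot remove the disagreement, can be illustrated with a short sketch. It is not the French Committee's computation: the marks are hypothetical, the target average and standard deviation are arbitrary, and the function names are introduced here. The point it shows is that bringing two examiners to the same average and the same standard deviation leaves the correlation between them unchanged.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standard_deviation(values):
    centre = mean(values)
    return math.sqrt(sum((v - centre) ** 2 for v in values) / len(values))


def rescale(values, target_mean, target_sd):
    """Shift and stretch marks to a chosen average and standard deviation."""
    centre, spread = mean(values), standard_deviation(values)
    return [target_mean + target_sd * (v - centre) / spread for v in values]


def correlation(first, second):
    """Pearson correlation coefficient r between two sets of marks."""
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    return covariance / (
        len(first) * standard_deviation(first) * standard_deviation(second)
    )


# Hypothetical marks out of 20 by two examiners for the same six scripts.
severe = [6, 8, 7, 11, 9, 10]
lenient = [12, 11, 15, 14, 16, 13]

severe_rescaled = rescale(severe, 10, 3)
lenient_rescaled = rescale(lenient, 10, 3)

print(round(correlation(severe, lenient), 3))
print(round(correlation(severe_rescaled, lenient_rescaled), 3))
```

Both printed values are the same, so any disagreement in the ordering of the scripts survives the corrections, which is why the passage reports low correlations even after the scales have been aligned.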





